WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 
C07K 14/00 



A2 



(11) International Publication Number: WO 98/58953 

(43) International Publication Date: 30 December 1998 (30.12.98) 



(21) International Application Number: PCT/DK98/00266 

(22) International Filing Date: 19 June 1998 (19.06.98) 



(30) Priority Data: 
0744/97          23 June 1997 (23.06.97)          DK 



(71)(72) Applicants and Inventors: BIRKELUND, Svend 
[DK/DK]; Sotoften 26, DK-8250 Ega (DK). CHRISTIANSEN, 
Gunna [DK/DK]; Sotoften 26, DK-8250 Ega (DK). 

(72) Inventors; and 
(75) Inventors/Applicants (for US only): KNUDSEN, Katrine 
[DK/DK]; Lundingsgade 33, Lejlighed 407, DK-8000 
Arhus C (DK). MADSEN, Anna-Sofie [DK/DK]; Ramsherred 
51 b, 1.tv., DK-6200 Aabenraa (DK). MYGIND, Per 
[DK/DK]; Falstersgade 5, 3.tv., DK-8000 Arhus C (DK). 

(74) Agent: PLOUGMANN, VINGTOFT & PARTNERS A/S; 
Sankt Annæ Plads 11, P.O. Box 3007, DK-1021 Copenhagen 
K (DK). 



(81) Designated States: AL, AM, AT, AT (Utility model), AU, AZ, 
BA, BB, BG, BR, BY, CA, CH, CN, CU, CZ, CZ (Utility 
model), DE, DE (Utility model), DK, DK (Utility model), 
EE, EE (Utility model), ES, FI, FI (Utility model), GB, GE, 
GH, GM, GW, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, 
LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, MN, MW, 
MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SK 
(Utility model), SL, TJ, TM, TR, TT, UA, UG, US, UZ, 
VN, YU, ZW, ARIPO patent (GH, GM, KE, LS, MW, SD, 
SZ, UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, 
RU, TJ, TM), European patent (AT, BE, CH, CY, DE, DK, 
ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI 
patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML, MR, NE, 
SN, TD, TG). 



Published 

Without international search report and to be republished 
upon receipt of that report. 



(54) Title: NOVEL SURFACE EXPOSED PROTEINS FROM CHLAMYDIA PNEUMONIAE 
(57) Abstract 

The invention relates to the identification of members of a gene family from the human respiratory pathogen Chlamydia pneumoniae, 
encoding surface exposed membrane proteins of a size of approximately 89-101 kDa and of 56-57 kDa, preferably about 89.6-100.3 kDa 
and about 56.1 kDa. The invention relates to the novel DNA sequences, the deduced amino acid sequences of the corresponding proteins 
and the use of the DNA sequences and the proteins in diagnosis of infections caused by C. pneumoniae, in pathology, in epidemiology, and 
as vaccine components. 




WO 98/58953 PCT/DK98/00266 


NOVEL SURFACE EXPOSED PROTEINS FROM CHLAMYDIA PNEUMONIAE 



The present invention relates to the identification of 
members of a gene family from the human respiratory pathogen 
Chlamydia pneumoniae, encoding surface exposed membrane 
proteins of a size of approximately 89-101 kDa and of 56-57 
kDa, preferably about 89.6-100.3 kDa and about 56.1 kDa. The 
invention relates to the novel DNA sequences, the deduced 
amino acid sequences of the corresponding proteins and the 
use of the DNA sequences and the proteins in diagnosis of 
infections caused by C. pneumoniae, in pathology, in 
epidemiology, and as vaccine components. 

GENERAL BACKGROUND 

C. pneumoniae is an obligate intracellular bacterium 
(Christiansen and Birkelund (1992); Grayston et al. (1986)). 
It has a cell wall structure like that of Gram negative bacteria, 
with an outer membrane, a periplasmic space, and a cytoplasmic 
membrane. It is possible to purify the outer membrane from 
Gram negative bacteria with the detergent sarkosyl. This 
fraction is named the 'outer membrane complex (OMC)' (Caldwell 
et al. (1981)). The COMC (Chlamydia outer membrane complex) 
of C. pneumoniae contains four groups of proteins: a high 
molecular weight protein of 98 kDa as determined by SDS-PAGE, a 
double band of the cysteine rich outer membrane protein 2 
(Omp2) of 62/60 kDa, the major outer membrane protein 
(MOMP) of 38 kDa, and the low molecular weight lipoprotein 
Omp3 of 12 kDa. The Omp2/Omp3 and MOMP proteins are present 
in COMC from all Chlamydia species, and these genes have been 
cloned from C. trachomatis, C. psittaci and C. 
pneumoniae. However, the gene encoding the 98 kDa protein from 
C. pneumoniae COMC has not been characterized or cloned. 

The current state of C. pneumoniae serology and detection 

C. pneumoniae is an obligate intracellular bacterium 
belonging to the genus Chlamydia, which can be divided into 
four species: C. trachomatis, C. pneumoniae, C. psittaci and 
C. pecorum. Common to the four species are their obligate 
intracellular growth and their biphasic life 
cycle, with an extracellular infectious particle (the 
elementary body, EB) and an intracellular replicating form 
(the reticulate body, RB). In addition, the Chlamydia species 
are characterized by a common lipopolysaccharide (LPS) 
epitope that is highly immunogenic in human infection. C. 
trachomatis causes the human ocular infection trachoma 
and genital infections. C. psittaci is a variable group of 
animal pathogens of which the avian strains can occasionally 
infect humans and give rise to a severe pneumonia 
(ornithosis). The first C. pneumoniae isolate was obtained 
from an eye infection, but it was classified as a non-typable 
Chlamydia. During an epidemic outbreak of pneumonia in Finland 
it was realized that the patients had a positive reaction in 
the Chlamydia genus specific test (the Lygranum test), and 
the patients showed a titre increase to the untyped Chlamydia 
isolates. Similar isolates were obtained in an outbreak of 
upper respiratory tract infections in Seattle, and the 
Chlamydia isolates were classified as a new species, 
Chlamydia pneumoniae (Grayston et al. (1989)). In addition, 
C. pneumoniae is suggested to be involved in the development 
of atherosclerotic lesions and in initiating bronchial 
asthma (Kuo et al. (1995)). These two conditions are thought 
to be caused by chronic infection, by a 
hypersensitivity reaction, or by both. 

Diagnosis of Chlamydia pneumoniae infections 

Diagnosis of acute respiratory tract infection with C. 
pneumoniae is difficult. Cultivation of C. pneumoniae from 
patient samples is insensitive, even when proper tissue 
culture cells are selected for the isolation. A C. pneumoniae 
specific polymerase chain reaction (PCR) has been developed 
by Campbell et al. (1992). 
Even though Chlamydia pneumoniae has been detected by this 
PCR in several studies, it is debated whether this method is 
suitable for detection in all clinical situations. The 
reason for this is that the cells carrying Chlamydia 
pneumoniae in acute respiratory infections have not been 
determined, and that a chronic carrier state is expected but 
it is unknown in which organs and cells the bacteria are present. 
Furthermore, the PCR test is difficult to perform due to the 
low yield of these bacteria and due to the presence of 
inhibitory substances in the patient samples. Therefore, it 
will be of great value to develop sensitive and specific 
sero-diagnostics for detecting both acute and chronic 
infections. Sero-diagnosis of Chlamydia infections is 
currently based on either genus specific tests such as the 
Lygranum test and ELISA, measuring the antibodies to LPS, or 
the more species specific tests where antibodies to purified 
EBs are measured by microimmunofluorescence (micro-IF) (Wang 
et al. (1970)). However, the micro-IF method is read by 
microscopy, and in order to ensure correct readings the 
result must be compared to the results with C. trachomatis 
used as antigen, due to the cross-reacting antibodies to the 
common LPS epitope. Thus, there exists in the art an urgent 
need for development of reliable methods for species specific 
diagnosis of Chlamydia pneumoniae, as has been expressed in 
Kuo et al. (1995): "...a rapid reliable laboratory test of 
infection for the clinical laboratory is a major need in the 
field". Furthermore, the possible involvement of C. 
pneumoniae in atherosclerosis and bronchial asthma clearly 
warrants the development of an effective vaccine. 

DETAILED DISCLOSURE OF THE INVENTION 

The present invention aims at providing means for efficient 
diagnosis of infections with Chlamydia pneumoniae as well as 
the development of effective vaccines against infection with 
this microorganism. The invention thus relates to species 
specific diagnostic tests for infection in a mammal, such as 
a human, with Chlamydia pneumoniae, said tests being based on 
the detection of antibodies against surface exposed membrane 
proteins of a size of approximately 89-101 kDa and of 56-57 
kDa, preferably of about 89.6-100.3 kDa and about 56.1 kDa 
(the size of the deduced amino acid sequences ranged 
from 89.6 to 100.3 kDa, except for Omp13 with a size of 56.1 
kDa), or the detection of nucleic acid fragments encoding 
such proteins or variants or subsequences thereof. The 
invention further relates to the amino acid sequences of 
proteins according to the invention, to variants and 
subsequences thereof, and to nucleic acid fragments encoding 
these proteins or variants or subsequences thereof. The 
present invention further relates to antibodies against 
proteins according to the invention. The invention also 
relates to the use of nucleic acid fragments and proteins 
according to the invention in diagnosis of Chlamydia 
pneumoniae and vaccines against Chlamydia pneumoniae. 

Prior to the disclosure of the present invention only a very 
limited number of genes from C. pneumoniae had been 
sequenced. These were primarily the genes encoding known C. 
trachomatis homologues: MOMP, Omp2, Omp3, Kdo-transferase, 
the heat shock protein genes GroEl/Es and DnaK, a 
ribonuclease P homologue and a gene encoding a 76 kDa protein 
of unknown function. The reason why so few genes have been 
cloned to date is the very low yield of C. pneumoniae which 
can be obtained after purification from the host cells. After 
such purification the DNA must be purified from the EBs, and 
at this step the C. pneumoniae DNA can easily be contaminated 
with host cell DNA. In addition to these inherent 
difficulties, it is exceedingly difficult to cultivate C. 
pneumoniae and use DNA technology to produce expression 
libraries with very low amounts (few µg) of DNA. It has been 
known since 1993 (Melgosa et al., 1993) that a 98 kDa protein 
is present in OMC from C. pneumoniae. Even though the protein 
band of 98 kDa was mentioned to be part of the OMC of C. 
pneumoniae by Melgosa, the gene sequences and thus the 
deduced amino acid sequences have not been determined. Only 
bands originating from Chlamydia pneumoniae proteins in 
general separated by SDS-PAGE are described therein. 
However, the gene encoding this protein had not been 
determined before the present invention. Only a very weak or 
no reaction with patient sera can be observed to the 98 kDa 
protein (Campbell et al. 1990) and prior to the work of the 
present inventors it had not been recognized that the 89-101 
kDa proteins are surface exposed or that they are in fact 
immunogenic. In this report it is described that a number of 
human serum samples react with a C. pneumoniae protein that 
in SDS-PAGE migrates at 98 kDa. The protein was not further 
characterized and it is therefore not in conflict with the 
present application. 

Halme et al. (1997) described the presence of human T-cell 
epitopes in C. pneumoniae proteins of 92-98 kDa. The proteins 
were eluted from SDS-PAGE of total chlamydia proteins but the 
identity of the proteins was not determined. 

Use of antibodies to screen expression libraries is a well 
known method to clone fragments of genes encoding antigenic 
parts of proteins. However, since patient sera do not show a 
significant reaction with the 98 kDa protein it has not been 
possible to use patient serum to clone the proteins. 

It was known that monoclonal antibodies generated by the 
inventors reacted with conformational epitopes on the surface 
of C. pneumoniae and that they also reacted with C. 
pneumoniae OMC by immuno-electron microscopy (Christiansen et 
al. 1994). Furthermore, the 98 kDa protein is the only 
unknown protein from the C. pneumoniae OMC (Melgosa et al. 
1993). The present inventors chose to take an unconventional 
step in order to clone the gene encoding the hitherto unknown 
98 kDa protein: C. pneumoniae OMC was purified and the highly 
immunogenic conformational epitopes were destroyed by SDS- 
treatment of the antigen before immunization. Thereby an 
antibody (PAB 150) to less immunogenic linear epitopes was 
obtained. This provided the possibility to obtain an 
antiserum which could detect the protein, and it was shown 
that a gene family encoding the 89-101 kDa and 56 kDa proteins 
according to the invention could be detected in colony 
blotting of recombinant E. coli. 

Mice infected with C. pneumoniae generate antibodies to the 
proteins identified by the inventors and named Omp4-15, but 
these antibodies do not recognize the SDS treated, heat 
denatured antigens normally used for SDS-PAGE and 
immunoblotting. However, a strong reaction was seen if the 
antigen was not heat denatured. It is therefore highly likely 
that, if a similar reaction is seen in connection with human 
infections, the antigens of the present invention will be of 
invaluable use in sero-diagnostic tests and may very likely 
be used as a vaccine for the prevention of infections. 

By generating antibodies against COMC from C. pneumoniae a 
polyclonal antibody (PAB 150) was obtained which reacted with 
all the proteins. This antibody was used to identify the 
genes encoding the 89.6-100.3 kDa and 56.1 kDa proteins in an 
expression library of C. pneumoniae DNA. A problem in 
connection with the present invention was that a family 
comprising a number of similar genes was found in C. 
pneumoniae. Therefore, a large number of different clones 
were required to identify clusters of fragments. Only because 
the rabbit antibody generated by the use of SDS-denatured 
antigens contained antibodies to a high number of different 
epitopes positioned on different members of the protein 
family did the inventors succeed in cloning and sequencing 
four of the genes. One gene was fully sequenced, a second was 
sequenced except for the distal part, and shorter fragments of 
two additional genes were obtained by this procedure. To 
obtain the DNA sequence of the additional genes and to search 
for more members of the gene family, long range PCR with 
primers derived from the sequenced genes and primers from 
the genes already published in the database was used. This 
approach gave rise to the detection of eight additional genes 
belonging to this family. The genes were situated in two gene 
clusters: Omp12, 11, 10, 5, 4, 13 and 14 in one cluster and 
Omp6, 7, 8, 9 and 15 in the second. Full sequence was obtained 
from Omp4, 5, 6, 7, 8, 9, 10, 11 and 13, and partial sequence of 
Omp12 and 14. Omp13 was a truncated gene of 1545 nucleotides. The 
rest of the full length genes were from 2526 (Omp7) to 2838 
(Omp15) nucleotides. The deduced amino acid sequences 
revealed putative polypeptides of 89.6 to 100.3 kDa, except 
for Omp13 of 56.1 kDa. Alignment of the deduced amino acid 
sequences showed a maximum identity of 49% (Omp5/Omp9) when 
all the sequences were compared. Except for Omp13, the lowest 
homology was to Omp7, with no more than 34% identity to any of 
the other amino acid sequences. The scores for Omp13 were from 
29 to 32% to all the other sequences. 
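
The relation between the gene lengths and the protein sizes given above can be checked roughly with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed method: it assumes that the stated nucleotide counts include one stop codon and it uses a rule-of-thumb average residue mass of 110 Da, both of which are assumptions. Under these assumptions the 1545 nucleotide Omp13 gene gives about 56.5 kDa, close to the deduced 56.1 kDa; the rule is coarse, and for the longer genes it overshoots the deduced values by a few kDa.

```python
# Rough estimate of protein mass from open reading frame length.
# Assumption: 110 Da is a generic average residue mass, not a value from the source.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(orf_nucleotides, includes_stop_codon=True):
    """Return an approximate protein mass in kDa for an ORF of the given length."""
    if orf_nucleotides <= 0 or orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nucleotides // 3
    # The stop codon does not encode a residue (assumed to be counted in the length).
    residues = codons - 1 if includes_stop_codon else codons
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for name, length in (("Omp13", 1545), ("Omp7", 2526), ("Omp15", 2838)):
        print(f"{name}: {length} nt, approx. {estimate_mass_kda(length):.1f} kDa")
```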

In the present context SEQ ID Nos 1 and 2 correspond to 
Omp4, SEQ ID Nos 3 and 4 correspond to Omp5, SEQ ID Nos 5 and 
6 correspond to Omp6, SEQ ID Nos 7 and 8 correspond to Omp7, 
SEQ ID Nos 9 and 10 correspond to Omp8, SEQ ID Nos 11 and 12 
correspond to Omp9, SEQ ID Nos 13 and 14 correspond to 
Omp10, SEQ ID Nos 15 and 16 correspond to Omp11, SEQ ID Nos 
17 and 18 correspond to Omp12, SEQ ID Nos 19 and 20 
correspond to Omp13, SEQ ID Nos 21 and 22 correspond to 
Omp14, and SEQ ID Nos 23 and 24 correspond to Omp15. 
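
The numbering above follows a regular pattern: each of Omp4 to Omp15 has two consecutive SEQ ID Nos, the odd one being the nucleic acid sequence and the even one the deduced protein, as the lists of SEQ ID NOs elsewhere in this description indicate. A minimal Python sketch of that lookup follows; the function names are hypothetical and serve only as an illustration.

```python
# Lookup between Omp numbers (4-15) and SEQ ID Nos (1-24), following the
# pairing listed in the description. Function names are illustrative only.
FIRST_OMP = 4
LAST_OMP = 15


def seq_ids_for_omp(omp_number):
    """Return (nucleic acid SEQ ID No, protein SEQ ID No) for a given Omp."""
    if not FIRST_OMP <= omp_number <= LAST_OMP:
        raise ValueError("Omp number must be between 4 and 15")
    dna_id = 2 * (omp_number - FIRST_OMP) + 1
    return dna_id, dna_id + 1


def omp_for_seq_id(seq_id):
    """Return the Omp number to which a SEQ ID No between 1 and 24 belongs."""
    if not 1 <= seq_id <= 2 * (LAST_OMP - FIRST_OMP + 1):
        raise ValueError("SEQ ID No must be between 1 and 24")
    return FIRST_OMP + (seq_id - 1) // 2


if __name__ == "__main__":
    for omp in range(FIRST_OMP, LAST_OMP + 1):
        dna_id, protein_id = seq_ids_for_omp(omp)
        print(f"Omp{omp}: SEQ ID Nos {dna_id} and {protein_id}")
```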

The estimated sizes of the Omp proteins of the present 
invention are listed in the following. Omp4 has a size of 
98.9 kDa, Omp5 has an estimated size of 97.2 kDa, Omp6 has an 
estimated size of 100.3 kDa, Omp7 has an estimated size of 
89.7 kDa, Omp8 has an estimated size of 90.0 kDa, Omp9 has an 
estimated size of 96.7 kDa, Omp10 has an estimated size of 
98.4 kDa, Omp11 has an estimated size of 97.6 kDa, Omp13 has 
an estimated size of 56.1 kDa, Omp12 and 14 being partial. 

Furthermore, SEQ ID No 25 is a subsequence of SEQ ID No 3, 
SEQ ID No 26 is a subsequence of SEQ ID No 4, SEQ ID No 27 is 
a subsequence of SEQ ID No 5, SEQ ID No 28 is a subsequence 
of SEQ ID No 6, SEQ ID No 29 is a subsequence of SEQ ID No 7, 
and SEQ ID No 30 is a subsequence of SEQ ID No 8. 

Parts of the Omp proteins were expressed as fusion proteins, 
and mouse polyclonal monospecific antibodies against the 
proteins were produced. The antibodies reacted with the 
surface of C. pneumoniae in both immunofluorescence and 
immunoelectron microscopy. This shows for the first time that 
the 89-101 kDa and 56-57 kDa protein family in C. pneumoniae 
comprises surface exposed outer membrane proteins. This 
important finding leads to the realization that members of 
the 89-101 kDa and 56-57 kDa C. pneumoniae protein family are 
good candidates for the development of a sero-diagnostic test 
for C. pneumoniae, as well as the development of a vaccine 
against infections with C. pneumoniae based on using these 
proteins. Furthermore, the proteins may be used as 
epidemiological markers, and polyclonal monospecific sera 
against the proteins can be used to detect C. pneumoniae in 
human tissue or detect C. pneumoniae isolates in tissue 
culture. Also, the genes encoding the 89-101 kDa and 56-57 
kDa protein family, such as the 89.6-100.3 kDa and 56.1 kDa 
proteins, may be 
used for the development of a species specific diagnostic 
test based on nucleic acid detection/amplification. 

The full length Omp4 was cloned into an expression vector 
system that allowed expression of the Omp4 polypeptide. This 
polypeptide was used as antigen for immunization of a rabbit. 
Since the protein was purified under denaturing conditions the 
antibody did not react with the native surface of C. 
pneumoniae, but it reacted with a 98 kDa protein in 
immunoblotting where purified C. pneumoniae EB was used as 
antigen. Furthermore, the antibody reacted in paraffin 
embedded sections of lung tissue from experimentally infected 
mice. 

A broad aspect of the present invention relates to a species 
specific diagnostic test for infection of a mammal, such as a 
human, with Chlamydia pneumoniae, said test comprising 
detecting in a patient or preferably in a patient sample the 
presence of antibodies against proteins from the outer 
membrane of Chlamydia pneumoniae, said proteins being of a 
molecular weight of 89-101 kDa or 56-57 kDa, or detecting the 
presence of nucleic acid fragments encoding said outer 
membrane proteins or fragments thereof. 

In the context of the present application, the term "patient 
sample" should be taken to mean an amount of serum from a 
patient, such as a human patient, or an amount of plasma from 
said patient, or an amount of mucosa from said patient, or an 
amount of tissue from said patient, or an amount of 
expectorate, forced sputum or a bronchial aspirate, an amount 
of urine from said patient, or an amount of cerebrospinal 
fluid from said patient, or an amount of atherosclerotic 
lesion from said patient, or an amount of mucosal swabs from 
said patient, or an amount of cells from a tissue culture 
originating from said patient, or an amount of material which 
in any way originates from said patient. The in vivo test in 
a human according to the present invention includes a skin 
test known in the art such as an intradermal test, e.g. 
similar to a Mantoux test. In certain patients being very 
sensitive to the test, such as is often the case with 
children, the test could be non-invasive, such as a 
superficial test on the skin, e.g. by use of a plaster. 

In the present context, the term 89-101 kDa protein means 
proteins normally present in the outer membrane of Chlamydia 
pneumoniae, which in SDS-PAGE can be observed as one or more 
bands with an apparent molecular weight substantially in the 
range of 89-101 kDa. From the deduced amino acid sequences 
the molecular size varies from 89.6 to 100.3 kDa. 

Within the scope of the present invention are species 
specific sero-diagnostic tests based on the usage of the 
genes belonging to the gene family disclosed in the present 
application. 

Preferred embodiments of the present invention relate to 
species specific diagnostic tests according to the invention, 
wherein the outer membrane proteins have sequences selected 
from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ 
ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID 
NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID 
NO: 22, and SEQ ID NO: 24. 

When used in connection with proteins according to the 
present invention the term "variant" should be understood as 
a sequence of amino acids which shows a sequence similarity 
of less than 100% to one of the proteins of the invention. A 
variant sequence can be of the same size as, or of a 
different size from, the sequence it is compared to. A variant 
will typically show a sequence similarity of at 
least 50%, preferably at least 60%, more preferably at least 
70%, such as at least 80%, e.g. at least 90%, 95% or 98%. 

The term "sequence similarity" in connection with sequences 
of proteins of the invention means the percentage of 
identical and conservatively changed amino acid residues 
(with respect to both position and type) in the proteins of 
the invention and an aligned protein of equal or different 
length. The term "sequence identity" in connection with 
sequences of proteins of the invention means the percentage 
of identical amino acid residues (with respect to both 
position and type) in the proteins of the invention and an 
aligned protein of equal or different length. 
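
The two definitions above can be illustrated with a short Python sketch that scores a pre-aligned pair of protein sequences. It is a sketch under stated assumptions rather than the scoring actually used for the figures in this description: the grouping of conservatively changed residues, the treatment of gaps (counted in the denominator, never as a match) and the function name are all assumptions made for illustration.

```python
# Percent identity and percent similarity for two pre-aligned protein sequences.
# Assumption: these conservative substitution groups are one common convention;
# the source does not say which grouping or alignment program was used.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity); '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column with no residue in either sequence
        compared += 1
        if a == "-" or b == "-":
            continue  # a gap opposite a residue counts as a mismatch
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    print(identity_and_similarity("GGAI", "GGAV"))
```

Similarity is always at least as high as identity under this definition, because every identical position is also counted as similar.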

Within the scope of the present invention are subsequences of 
one of the proteins of the invention, meaning a consecutive 
stretch of amino acid residues taken from SEQ ID NO: 2, SEQ 
ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID 
NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID 
NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24. A subsequence will 
typically comprise at least 100 amino acids, preferably at 
least 80 amino acids, more preferably at least 70 amino 
acids, such as 50 amino acids. It might even be as small as 
10-50 amino acids, such as 20-40 amino acids, e.g. about 30 
amino acids. A subsequence will typically show a sequence 
homology of at least 50%, preferably at least 60%, more 
preferably at least 70%, such as at least 80%, e.g. at least 
90%, 95% or 98%. 

Diagnostic tests according to the invention include 
immunoassays selected from the group consisting of a direct 
or indirect EIA such as an ELISA, an immunoblot technique 
such as a Western blot, a radioimmunoassay, and any other 
non-enzyme linked antibody binding assay or procedure such as 
a fluorescence, agglutination or precipitation reaction, and 
nephelometry. 

A preferred embodiment of the present invention relates to 
species specific diagnostic tests according to the invention, 
said test comprising an ELISA, wherein antibodies against the 
proteins of the invention or fragments thereof are detected 
in samples. 

A preferred embodiment of the invention is an ELISA based on 
detection in samples of antibodies against proteins of the 
invention. The ELISA may use proteins of the invention, or 
variants thereof, i.e. the antigen, as coating agent. An 
ELISA will typically be developed according to standard 
methods well known in the art, such as methods described in 
"Antibodies: A Laboratory Manual", Ed Harlow and David Lane, 
Cold Spring Harbor Laboratory (1988), which is hereby 
incorporated by reference. 

Recombinant proteins will be produced using DNA sequences 
obtained essentially using methods described in the examples 
below. Such DNA sequences, comprising the entire coding 
region of each gene in the gene family of the invention, will 
be cloned into an expression vector from which the deduced 
protein sequence can be purified. The purified proteins will 
be analyzed for reactivity in ELISA using both monoclonal and 
polyclonal antibodies as well as sera from experimentally 
infected mice and human patient sera. 
From the experimentally infected mice sera it is known that 
non-linear epitopes are recognized predominantly. Thus, it is 
contemplated that different forms of purification schemes 
known in the art will be used to analyze for the presence of 
discontinuous epitopes, and to analyze whether the human 
immune response is also directed against such epitopes. 

Preferred embodiments of the present invention relate to 
species specific diagnostic tests according to the invention, 
wherein the nucleic acid fragments have sequences selected 
from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ 
ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID 
NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID 
NO: 21, and SEQ ID NO: 23. 

In connection with nucleic acid fragments according to the 
present invention the term "variant" should be understood as 
a sequence of nucleic acids which shows a sequence homology 
of less than 100%. A variant sequence can be of the same size 
or it can be of a different size as the sequence it is 
compared to. A variant will typically show a sequence 
homology of at least 50%, preferably at least 60%, more 
preferably at least 70%, such as at least 80%, e.g. at least 
90%, 95% or 98%. 

The term "sequence homology" in connection with nucleic acid 
fragments of the invention means the percentage of matching 
nucleic acids (with respect to both position and type) in the 
nucleic acid fragments of the invention and an aligned 
nucleic acid fragment of equal or different length. 

In order to obtain information concerning the general 
distribution of each of the genes according to the present 
invention, PCR will be performed for each gene on all 
available C. pneumoniae isolates. This will provide 
information on the general variability of the genes or 
nucleic acid fragments of the invention. Variable regions 
will be sequenced. From patient samples PCR will be used to 
amplify variable parts of the genes for epidemiology. Non- 
variable parts will be used for amplification by PCR and 
analyzed for possible use as a diagnostic test. It is 
contemplated that if variability is discovered, PCR of 
variable regions can be used for epidemiology. PCR of non- 
variable regions can be used as a species specific diagnostic 
test. Using genes encoding proteins known to be invariable in 
all known isolates prepared as targets for PCR to genes 
encoding proteins with unknown function. 
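
The split into variable regions (for epidemiology) and non-variable regions (for a species specific diagnostic PCR) can be sketched in Python as a scan over aligned gene sequences from several isolates. This is only an illustration under assumptions: the sequences are taken to be already aligned to equal length, a position counts as non-variable only if every isolate carries the same base there, and the function names and toy sequences are hypothetical.

```python
# Locate non-variable stretches and variable positions in aligned gene
# sequences from several isolates. Toy data; names are illustrative only.


def _check_alignment(aligned_isolates):
    if not aligned_isolates:
        raise ValueError("at least one sequence is required")
    length = len(aligned_isolates[0])
    if any(len(seq) != length for seq in aligned_isolates):
        raise ValueError("sequences must be aligned to the same length")
    return length


def conserved_windows(aligned_isolates, min_length):
    """Return half-open (start, end) runs of identical, gap-free columns."""
    length = _check_alignment(aligned_isolates)
    windows = []
    run_start = None
    for position in range(length + 1):
        if position < length:
            column = {seq[position].upper() for seq in aligned_isolates}
            conserved = len(column) == 1 and "-" not in column
        else:
            conserved = False  # sentinel step that closes a run at the end
        if conserved and run_start is None:
            run_start = position
        elif not conserved and run_start is not None:
            if position - run_start >= min_length:
                windows.append((run_start, position))
            run_start = None
    return windows


def variable_positions(aligned_isolates):
    """Return 0-based positions at which the isolates differ."""
    length = _check_alignment(aligned_isolates)
    return [
        position
        for position in range(length)
        if len({seq[position].upper() for seq in aligned_isolates}) > 1
    ]


if __name__ == "__main__":
    isolates = ["ACGTACGTAA", "ACGTTCGTAA", "ACGTACGTAA"]  # made-up sequences
    print(conserved_windows(isolates, 4))
    print(variable_positions(isolates))
```

In practice the minimum window length would be chosen to fit a primer pair and amplicon; that choice is outside what the description specifies.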

Particularly preferred embodiments of the present invention 
relate to diagnostic tests according to the invention, 
wherein detection of nucleic acid fragments is obtained by 
using nucleic acid amplification, preferably polymerase chain 
reaction (PCR). 

Within the scope of the present invention is a PCR based test 
directed at detecting nucleic acid fragments of the invention 
or variants thereof. A PCR test will typically be developed 
according to methods well known in the art and will typically 
comprise a PCR test capable of detecting and differentiating 
between nucleic acid fragments of the invention. Preferred 
are quantitative competitive PCR tests or nested PCR tests. 
The PCR test according to the invention will typically be 
developed according to methods described in detail in EP B 
540 588, EP A 586 112, EP A 643 140 or EP A 669 401, which 
are hereby incorporated by reference. 

Within the scope of the present invention are variants and
subsequences of one of the nucleic acid fragments of the
invention, meaning a consecutive stretch of nucleic acids
taken from SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID
NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO:
15, SEQ ID NO: 19, SEQ ID NO: 21, or SEQ ID NO: 23. A variant
or subsequence will preferably comprise at least 100 nucleic
acids, preferably at least 80 nucleic acids, more preferably
at least 70 nucleic acids, such as at least 50 nucleic acids.
It might even be as small as 10-50 nucleic acids, such as
20-40 nucleic acids, e.g. about 30 nucleic acids. A
subsequence will typically show a sequence homology of at
least 30%, preferably at least 60%, more preferably at least
70%, such as at least 80%, e.g. at least 90%, 95% or 98%. The
shorter the subsequence, the higher the required homology.
Accordingly, a subsequence of 100 nucleic acids or fewer must
show a homology of at least 80%.
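
By way of illustration only, the length-dependent homology rule stated
above can be sketched in Python. The sketch assumes that "homology" is
measured as simple percent identity between two pre-aligned sequences of
equal length; the specification does not fix the measure, so this is an
assumption. The function names and the toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def meets_homology_rule(candidate: str, reference: str) -> bool:
    """Apply the stated rule: 100 nucleotides or fewer requires >= 80%.

    For longer subsequences the lowest general threshold named in the
    text (30%) is used; choosing that fallback is an assumption.
    """
    identity = percent_identity(candidate, reference)
    required = 80.0 if len(candidate) <= 100 else 30.0
    return identity >= required


# Toy 10-nucleotide sequences, invented for illustration.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))     # 9 of 10 positions match
print(meets_homology_rule("ACGTACGTAC", "ACGTACGTAA"))
```

A real comparison would first align the sequences with a dedicated
alignment program; the sketch only covers the thresholding step.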

A very important aspect of the present invention relates to 
proteins of the invention derived from Chlamydia pneumoniae 
having amino acid sequences selected from the group 
consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ 
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID 
NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ 
ID NO: 24 having a sequence similarity of at least 50%, 
preferably at least 60%, more preferably at least 70%, such
as at least 80%, e.g. at least 90%, 95% or 98% and a similar 
biological function. 

By the term "similar biological function" is meant that the
protein shows characteristics similar to those of the proteins
derivable from the membrane proteins of Chlamydia pneumoniae.
Such proteins comprise repeated motifs of GGAI (at least 2,
preferably at least 3 repeats) and/or conserved positions of
tryptophan (W).
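
The GGAI repeat criterion can be checked mechanically. The following
Python sketch is illustrative only: the function names are hypothetical,
and the example sequence is invented and is not one of the sequences of
the invention.

```python
def count_motif(protein: str, motif: str = "GGAI") -> int:
    """Count non-overlapping occurrences of a motif in a one-letter protein sequence."""
    return protein.upper().count(motif)


def has_repeated_motif(protein: str, min_repeats: int = 2) -> bool:
    """True if the sequence carries at least min_repeats copies of GGAI."""
    return count_motif(protein) >= min_repeats


# Invented toy sequence containing three GGAI motifs.
toy = "MKTGGAIASLGGAIPQWGGAIT"
print(count_motif(toy))                        # 3
print(has_repeated_motif(toy, min_repeats=3))  # True
```

Setting min_repeats to 2 or 3 corresponds to the "at least 2, preferably
at least 3 repeats" wording above.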

Comparison of the DNA sequences from genes encoding Omp4-15
shows that the overall similarity between the individual
genes ranges from 43 to 55%. Comparison of the amino acid
sequences of Omp4-15 shows 34-49% identity and 53-64%
similarity. The homology is generally scattered along the
entire length of the deduced amino acid sequences. However, as
seen from figure 8 A - J there are some regions in which the
homology is more pronounced. This is seen in the repeated
sequence where the sequence GGAI is repeated 4-7 times in the
genes. It is interesting that the DNA homology is not
conserved for the sequences encoding the four amino acids
GGAI. This may indicate a functional role of this part of the
protein and indicates that the repeated structure did not
occur by a duplication of the gene. In addition to the four
amino acid repeats GGAI a region from amino acid 400 to 490
has a higher degree of homology than the rest of the protein,
with the conserved sequence FYDPI occurring in all sequences.
As further indication of similarity in function the amino
acid tryptophan (W) is perfectly conserved at 4-6
localizations in the C-terminal part of the protein.
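
The observation that the GGAI motif is conserved at the amino acid level
while the encoding DNA is not is possible because the genetic code is
degenerate. As an illustrative sketch only, the following Python code
enumerates the DNA sequences that encode GGAI under the standard genetic
code (glycine: GGN, alanine: GCN, isoleucine: ATT, ATC or ATA). The
names CODONS and encodings are hypothetical and not part of the
specification.

```python
from itertools import product

# Standard genetic code, restricted to the three amino acids in the motif.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],  # glycine
    "A": ["GCT", "GCC", "GCA", "GCG"],  # alanine
    "I": ["ATT", "ATC", "ATA"],         # isoleucine
}


def encodings(peptide: str) -> list[str]:
    """All DNA sequences that translate to the given peptide."""
    per_residue = (CODONS[aa] for aa in peptide)
    return ["".join(codons) for codons in product(*per_residue)]


ggai = encodings("GGAI")
print(len(ggai))  # 4 * 4 * 4 * 3 = 192 distinct 12-nucleotide sequences
```

With 192 possible encodings of the same four residues, repeats can
differ considerably at the DNA level while remaining identical at the
protein level.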

Since none of the genes and deduced amino acid sequences of
the invention are identical the following is within the scope
of the present invention: production of monospecific
antibodies, the use of said antibodies for characterizing
which C. pneumoniae proteins are expressed, the use of said
antibodies for characterizing at which time during the
developmental life cycle said C. pneumoniae proteins are
expressed, and the use of said antibodies for characterizing
the precise cellular localization of said C. pneumoniae
proteins. Also within the scope of the present invention is
the use of monospecific antibodies against proteins of the
invention for determining which part of said proteins is
surface exposed and how proteins in the C. pneumoniae COMC
interact with each other.

Preferred embodiments of the present invention relate to
polypeptides which comprise subsequences of the proteins of
the invention, said subsequences comprising the sequence
GGAI. Further preferred embodiments of the present invention
relate to polypeptides which comprise subsequences of the
proteins of the invention, said subsequences comprising the
sequence FSGE.

Polypeptides according to the invention will typically be of
a length of at least 6 amino acids, preferably at least 15
amino acids, preferably at least 20 amino acids, preferably
at least 25 amino acids, preferably at least 30 amino acids,
preferably at least 35 amino acids, preferably at least 40
amino acids, preferably at least 45 amino acids, preferably
at least 50 amino acids, preferably at least 55 amino acids,
preferably at least 100 amino acids.

A very important aspect of the present invention relates to 
nucleic acid fragments of the invention derived from 
Chlamydia pneumoniae, variants and subsequences thereof. 

Another important aspect of the present invention relates to 
antibodies against the proteins according to the invention, 
such antibodies including polyclonal monospecific antibodies 
and monoclonal antibodies against proteins with sequences 
selected from the group consisting of SEQ ID NO: 2, SEQ ID 
NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 
12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 
20, SEQ ID NO: 22, and SEQ ID NO: 24. 

A very important aspect of the present invention relates to 
diagnostic kits for the diagnosis of infection of a mammal, 
such as a human, with Chlamydia pneumoniae, said kits 
comprising one or more proteins with amino acid sequences 
selected from the group consisting of SEQ ID NO: 2, SEQ ID 
NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 
12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 
20, SEQ ID NO: 22, and SEQ ID NO: 24. 

Another very important aspect of the present invention 
relates to diagnostic kits for the diagnosis of infection of 
a mammal, such as a human, with Chlamydia pneumoniae, said 
kits comprising antibodies against a protein with an amino 
acid sequence selected from the group consisting of SEQ ID 
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 
18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ ID NO: 24. 
Antibodies included in a diagnostic kit according to the
invention can be polyclonal or monoclonal or a mixture
thereof.




Still another very important aspect of the present invention
relates to diagnostic kits for the diagnosis of infection of
a mammal, such as a human, with Chlamydia pneumoniae, said
kits comprising one or more nucleic acid fragments with
sequences selected from the group consisting of SEQ ID NO: 1,
SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ
ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ
ID NO: 19, SEQ ID NO: 21, and SEQ ID NO: 23.

An aspect of the present invention relates to a composition
for immunizing a mammal, such as a human, against Chlamydia
pneumoniae, said composition comprising one or more proteins
with amino acid sequences selected from the group consisting
of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8,
SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16,
SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ ID NO:
24.

An important role for the proteins of the invention in 
prevention of infection of a mammal, such as a human, with C. 
pneumoniae is expected. Thus proteins of the invention, 
including variants and subsequences will be produced, 
typically by using recombinant techniques, and will then be 
used as an antigen in immunization of mammals, such as 
rabbits. Subsequently, the hyperimmune sera obtained by the
immunization will be analyzed for protection against C.
pneumoniae infection using a tissue culture assay. In
addition it is contemplated that monoclonal antibodies will
be produced, typically using standard hybridoma techniques,
and analyzed for protection against infection with C.
pneumoniae.

It is envisioned that particularly interesting and
immunogenic epitopes will be found in connection with the
proteins of the invention, which will comprise subsequences
of said proteins. It is preferred to use polypeptides
comprising such subsequences of the proteins of the invention
in immunizing a mammal, such as a human, against Chlamydia
pneumoniae.

An important aspect of the present invention relates to the 
use of proteins with sequences selected from the group 
consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ 
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID 
NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ 
ID NO: 24 in diagnosis of infection of a mammal, such as a 
human, with Chlamydia pneumoniae. 

A preferred embodiment of the present invention relates to 
the use of proteins according to the invention in an 
undenatured form, in diagnosis of infection of a mammal, such 
as a human, with Chlamydia pneumoniae. 

A very important aspect of the present invention relates to 
the use of proteins with sequences selected from the group 
consisting of SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ 
ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID 
NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, and SEQ 
ID NO: 24, for immunizing a mammal, such as a human, against 
Chlamydia pneumoniae. 

A preferred embodiment of the present invention relates to 
the use of proteins according to the invention in an 
undenatured form, for immunizing a mammal, such as a human, 
against Chlamydia pneumoniae. 

A very important aspect of the present invention relates to 
the use of nucleic acid fragments with nucleotide sequences 
selected from the group consisting of SEQ ID NO: 1, SEQ ID 
NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 
11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 
19, SEQ ID NO: 21, and SEQ ID NO: 23 for immunizing a mammal, 
such as a human, against Chlamydia pneumoniae. 




It is envisioned that one type of vaccine against C.
pneumoniae will be developed by using gene-gun vaccination of
mice. Typically, different genetic constructs containing
nucleic acid fragments, or combinations of nucleic acid
fragments, according to the invention will be used in the
gene-gun approach. The mice will then subsequently be
analyzed for production of both humoral and cellular immune
response and for protection against infection with C.
pneumoniae after challenge herewith.

In line with this, the invention also relates to the uses of
the proteins of the invention as a pharmaceutical (a vaccine)
as well as to the uses thereof for the preparation of a
vaccine against infections with Chlamydia pneumoniae.

Preparation of vaccines which contain protein sequences as
active ingredients is generally well understood in the art,
as exemplified by U.S. Patents 4,608,251; 4,601,903;
4,599,231; 4,599,230; 4,596,792; and 4,578,770, all
incorporated herein by reference. Typically, such vaccines are
prepared as injectables either as liquid solutions or
suspensions; solid forms suitable for solution in, or suspension
in, liquid prior to injection may also be prepared. The
preparation may also be emulsified. The active immunogenic
ingredient is often mixed with excipients which are
pharmaceutically acceptable and compatible with the active
ingredient. Suitable excipients are, for example, water, saline,
dextrose, glycerol, ethanol, or the like, and combinations
thereof. In addition, if desired, the vaccine may contain
minor amounts of auxiliary substances such as wetting or
emulsifying agents, pH buffering agents, or adjuvants which
enhance the effectiveness of the vaccines.

The vaccines are conventionally administered parenterally, by
injection, for example, either subcutaneously or
intramuscularly. Additional formulations which are suitable for
other modes of administration include suppositories and, in some
cases, oral formulations. These compositions take the form of
solutions, suspensions, tablets, pills, capsules, sustained
release formulations or powders and contain 10-95% of active
ingredient, preferably 25-70%, and optionally a suitable
carrier.

The protein sequences may be formulated into the vaccine as 
neutral or salt forms known in the art. The vaccines are 
administered in a manner compatible with the dosage 
formulation, and in such amount as will be therapeutically 
effective and immunogenic. The quantity to be administered 
depends on the subject to be treated. Suitable dosage ranges 
are of the order of several hundred micrograms active
ingredient per vaccination with a preferred range from about
0.1 µg to 1000 µg. The immune response may be enhanced if the
vaccine further comprises an adjuvant substance as known in
the art. Other possibilities involve the use of
immunomodulating substances such as lymphokines (e.g. IFN-γ,
IL-2 and IL-12) or synthetic IFN-γ inducers such as poly I:C
in combination with the above-mentioned adjuvants.

It is also possible to produce a living vaccine by
introducing, into a non-pathogenic microorganism, at least one
nucleic acid fragment encoding a protein fragment or protein
of the invention, and effecting expression of the protein
fragment or the protein on the surface of the microorganism
(e.g. in the form of a fusion protein including a membrane
anchoring part or in the form of a slightly modified protein
or protein fragment carrying a lipidation signal which allows
anchoring in the membrane). The skilled person will know how
to adapt relevant expression systems for this purpose.

Another part of the invention is based on the fact that
recent research has revealed that a DNA fragment cloned in a
vector which is non-replicative in eukaryotic cells may be
introduced into an animal (including a human being) by e.g.
intramuscular injection or percutaneous administration (the
so-called "gene gun" approach). The DNA is taken up by e.g.
muscle cells and the gene of interest is expressed by a
promoter which is functioning in eukaryotes, e.g. a viral
promoter, and the gene product thereafter stimulates the
immune system. These newly discovered methods are reviewed in
Ulmer et al., 1993, which hereby is included by reference.

Thus, a nucleic acid fragment encoding a protein fragment or
protein of the invention may be used for effecting in vivo
expression of antigens, i.e. the nucleic acid fragments may be
used in so-called DNA vaccines. Hence, the invention also relates
to a vaccine comprising a nucleic acid fragment encoding a
protein fragment or a protein of the invention, the vaccine
effecting in vivo expression of antigen by a mammal, such as
a human, to whom the vaccine has been administered, the
amount of expressed antigen being effective to confer
substantially increased resistance to infections with
Chlamydia pneumoniae in a mammal, such as a human.

The efficacy of such a "DNA vaccine" can possibly be enhanced
by administering the gene encoding the expression product
together with a DNA fragment encoding a protein which has the
capability of modulating an immune response. For instance, a
gene encoding lymphokine precursors or lymphokines (e.g.
IFN-γ, IL-2, or IL-12) could be administered together with the
gene encoding the immunogenic protein fragment or protein,
either by administering two separate DNA fragments or by
administering both DNA fragments included in the same vector.

It is also a possibility to administer DNA fragments
comprising a multitude of nucleotide sequences which each encode
relevant epitopes of the protein fragments and proteins
disclosed herein so as to effect a continuous sensitization
of the immune system with a broad spectrum of these epitopes.

The following experimental non-limiting examples are intended
to illustrate certain features and embodiments of the
invention.

LEGENDS TO FIGURES



Figure 1. The figure shows electron microscopy of negative
stained purified C. pneumoniae EB (A) and purified OMC (B).

Figure 2. The figure shows silver stained 15% SDS-PAGE of
purified EB and OMC. Lane 1, purified C. pneumoniae EB; lane
2, C. pneumoniae OMC; lane 3, purified C. trachomatis EB; and
lane 4, C. trachomatis OMC.

Figure 3. The figure shows immunoblotting of C. pneumoniae EB
separated by 10% SDS-PAGE, transferred to nitrocellulose and
reacted with rabbit anti C. pneumoniae OMC.

Figure 4. The figure shows coomassie blue stained 7.5%
SDS-PAGE of recombinant pEX clones that were detected by the
rabbit anti C. pneumoniae serum. The arrow indicates the
localization of the 117 kDa β-galactosidase protein.

Figure 5. The figure shows immunoblotting of recombinant pEX
clones detected by colony blotting, separated by 7.5%
SDS-PAGE, transferred to nitrocellulose and reacted with
rabbit anti C. pneumoniae OMC. Lane 1, seablue molecular
weight standard. Lanes 2-6, pEX clones cultivated at 42°C to
induce the production of the β-galactosidase fusion proteins.

Figure 6. The figure shows sequence strategy for Omp4 and
Omp5. Arrows indicate primers used for sequencing.

Figure 7. C. pneumoniae omp genes. The genes are arranged in
two clusters. In cluster 1, Omp12, 11, 10, 5, 4, 13, and 14
are found. In cluster 2, Omp6, 7, 8, 9, and 15 are found.

Figure 8 A - J. The figure shows alignment of C. pneumoniae
Omp4-15, using the program pileup in the GCG package.

Figure 9. The figure shows immunofluorescence of C.
pneumoniae infected HeLa, 72 hrs. after infection, reacted
with mouse monospecific anti-serum against pEX3-36 fusion
protein. pEX3-36 is a part of the Omp5 gene.

Figure 10. The figure shows immunoblotting of C. pneumoniae
EB, lane 1-3 heated to 100°C in SDS-sample buffer, lane 4-6
unheated. Lane 1 reacted with rabbit anti C. pneumoniae OMC;
lane 2 and 4 pre-serum; lane 3 and 5 polyclonal rabbit anti
pEX1-1 fusion protein; lane 6 MAb 26.1.

Figure 11. The figure shows immunoblotting of C. pneumoniae
EB, lane 1-4 heated to 100°C in SDS-sample buffer, lane 5-6
unheated. Reacted with serum from C57-black mice 14 days
after infection with 10^7 CFU of C. pneumoniae. Lane 1 and 5
mouse 1; lane 2 and 6 mouse 2; lane 3 and 5 mouse 3; and lane
4 and 8 mouse 4.

Figure 12. The figure shows immunohistochemistry analysis of
mouse lung tissue with C. pneumoniae inclusions present both
in the bronchial epithelium and in the lung parenchyma
(arrows).

EXAMPLE 1

Cloning of the genes encoding the 98/95 kDa C. pneumoniae 
COMC proteins 

Purification of C. pneumoniae EBs and COMC

C. pneumoniae was cultivated in HeLa cells. Cultivation was
done according to the specifications of Miyashita and
Matsumoto (1992), with the modification that centrifugation
of supernatant and of the later precipitate and turbid bottom
layer was carried out at 100,000 x g. The microorganism was
attached to the HeLa cells by 30 minutes of centrifugation at
1000 x g, after which the cells were incubated in RPMI 1640
medium (Gibco BRL, Germany cat No. 51800-27), containing 5%
foetal calf serum (FCS, Gibco BRL, Germany Cat No. 10106.169)
and gentamicin for two hours at 37°C in 5% CO2 atmosphere. The
medium was changed to medium that in addition contained 1 µg
per ml of cycloheximide. After 48 hours of incubation a
coverslip was removed from the cultures and the inclusion was
tested with an antibody specific for C. pneumoniae (MAb 26.1)
(Christiansen et al. 1994) and a monoclonal antibody specific
for the species C. trachomatis (MAb 32.3, Loke diagnostics,
Arhus, Denmark) to ensure that no contamination with C.
trachomatis had occurred. The HeLa cells were tested by
Hoechst stain for Mycoplasma contamination as well as by
culture in BEa and BEg medium (Freund et al., 1979). The
C. pneumoniae stocks were also tested for Mycoplasma
contamination by cultivation in BEa and BEg medium. No
contamination with C. trachomatis, Mycoplasmas or bacteria
was detected in cultures or cells. 72 hours post-infection
the monolayer was washed in PBS, the cells were loosened in
PBS with a rubber policeman, and the Chlamydia were liberated
from the host cell by sonication. The C. pneumoniae EBs and
RBs were purified on discontinuous density gradients
(Miyashita et al. (1992)). The purity of the Chlamydia EBs
was verified by negative staining and electron microscopy
(Figure 1); only particles of a size of 0.3 to 0.5 µm were
detected, in agreement with the structure of C. pneumoniae EBs.
The purified Chlamydia EBs were subjected to sarkosyl
extraction as described by Caldwell et al. (1981) with the
modification that a brief sonication was used to suspend the
COMC. The purified COMC was tested by electron microscopy and
negative staining (Figure 1), where a folded outer membrane
complex was seen.

SDS-PAGE analysis of purified EBs and COMC 

The proteins from purified EBs and C. pneumoniae OMC were
separated on 15% SDS-polyacrylamide gel, and the gel was
silver stained (Figure 2). In lane 1 it is seen that the
purified EBs contain major proteins of 100/95 kDa and a
protein of 38 kDa; in the purified COMC (lane 2) these two
protein groups are also dominant. In addition, proteins with
a molecular weight of 62/60 kDa, 55 kDa, and 12 kDa have been
enriched in the COMC preparation. When the purified C.
pneumoniae EBs are compared to purified C. trachomatis EB
(lane 3) it is seen that the predominant protein in the C.
trachomatis EB is the major outer membrane protein (MOMP),
and it is also the dominant band in the COMC preparation of
C. trachomatis (lane 4), and Omp2 of 60/62 kDa as well as
Omp3 at 12 kDa are seen in the preparation. However, no major
bands with a size of 100/95 kDa are detected as in the C.
pneumoniae COMC preparation.

Production of rabbit polyclonal antibodies against C.
pneumoniae COMC

To ensure production of rabbit antibodies that would
recognize all the C. pneumoniae proteins in immuno-blotting
and colony-blotting, 10 µg of COMC antigen was dissolved in 20
µl of SDS sample buffer and thereafter divided into 5 vials.
The dissolved antigen was further diluted in one ml of PBS
and one ml of Freund incomplete adjuvant (Difco laboratories,
USA cat. No. 0639-60-6) and injected into the quadriceps
muscle of a New Zealand white rabbit. The rabbit was given
three intramuscular injections at an interval of one
week, and after a further three weeks the dissolved COMC
protein, diluted in one ml PBS, was injected intravenously,
and the procedure was repeated two weeks later. Eleven weeks
after the beginning of the immunization, the serum was
obtained from the rabbit. Purified C. pneumoniae EBs were
separated by SDS-PAGE, and the proteins were
electrotransferred to nitrocellulose membrane. The membrane
was blocked and immunostained with the polyclonal COMC
antibody (Figure 3). The serum recognized proteins with a
size of 100/95, 60 and 38 kDa in the EB preparation. This is
in agreement with the sizes of the outer membrane proteins.

Cloning of the COMC proteins 

Due to the cultivation of C. pneumoniae in HeLa cells,
contaminating host cell DNA could be present in the EB
preparations. Therefore, the purified EB preparations were
treated with DNAse to remove contaminating DNA. The C.
pneumoniae DNA was then purified by CsCl gradient
centrifugation. The C. pneumoniae DNA was partially digested
with Sau3A and the fractions containing DNA fragments with a
size of approx. 0.5 to 4.0 kb were cloned into the expression
vector system pEX (Boehringer, Germany cat. No. 1034 766,
1034 774, 1034 782). The pEX vector system has a
β-galactosidase gene with multiple cloning sites in the 3' end
of the β-galactosidase gene. Expression of the gene is
regulated by the PR promoter, so the protein expression can
be induced by elevating the temperature from 32 to 42°C. The
colonies of recombinant bacteria were transferred to
nitrocellulose membranes, and the temperature was increased
to 42°C for two hours. The bacteria were lysed by placing the
nitrocellulose membranes on filters soaked in 5% SDS. The
colonies expressing outer membrane proteins were detected
with the polyclonal antibody raised against C. pneumoniae
COMC. The positive clones were cultivated in suspension and
induced at 42°C for two hours. The protein profiles of the
clones were analysed by SDS-PAGE, and increases in the size
of the induced β-galactosidase were observed (Figure 4). In
addition, the proteins were electrotransferred to
nitrocellulose membranes, and the reaction with the
polyclonal serum against COMC was confirmed (Figure 5).

Sequencing of positive COMC clones

To characterize the pEX clones, the inserted C. pneumoniae
DNA was sequenced. The resulting DNA sequences were searched
against the prokaryotic sequences in the GenEmbl database.
The search identified 6 clones as part of the Omp2 gene,
[number illegible] clones as part of the Omp3 gene, and 2
clones as part of the MOMP gene, indicating that COMC proteins
had been successfully cloned. Furthermore, 32 clones were
obtained, containing DNA sequences not found in the GenEmbl
database. These sequences could, however, be clustered in two
contigs of 6 and 4 clones, and three clones were identical. In
addition 19 clones were found with no overlap to the contigs
(Figure 7). To obtain more sequence data for the genes, C.
pneumoniae DNA was totally digested with BamHI restriction
enzyme, and the fragments were cloned into the vector
pBluescript. The ligated DNA was electrotransformed into E.
coli XL1-Blue and selected on plates containing ampicillin.
The recombinant bacterial colonies were transferred to a
nitrocellulose membrane, and colony hybridisation was
performed using the insert of the pEX 1-1 clone as a probe. A
clone containing a single BamHI fragment of 4.5 kb was found,
and the hybridisation to the probe was confirmed by Southern
blotting. The insert of the clone was sequenced
bi-directionally using synthetic primers for approximately
every 300 bp. The sequence of the BamHI fragment made it
possible to join the two contigs of pEX clones. In total,
together with the pEX clones it was possible to assemble 6.5 kb
of DNA sequence, encoding two new COMC proteins (Figure 6).

Additional sequences were obtained by PCR performed on
purified C. pneumoniae DNA with primers both from the known
Omp genes and from other known genes. The obtained PCR
products were sequenced. The sequence organisation is shown
in Fig. 7. An additional 8 Omp genes were detected. The
alignment of the deduced amino acid sequences is shown in
Fig. 8 A and B.

Analysis of DNA sequence 

The DNA sequences encode the Omp4-15 proteins with sizes of
89.6-100.3 kDa (and for Omp13: 56.1 kDa). Omp4 and Omp5 were
transcribed in opposite directions. Downstream of Omp4 a
possible termination structure was located. The 3' end of the
Omp5 gene was not cloned due to the presence of the BamHI
restriction enzyme site positioned within the gene. The
translated DNA sequences of Omp4 and Omp5 were compared by use
of the gap programme in the GCG package (Wisconsin package,
version 8.1-UNIX, August 1995, sequence analysis software
package). The two genes had an amino acid identity of 41%
(similarity 61%), and a possible cleavage site for signal
peptidase I was present at amino acid 17 in Omp4 and amino
acid 25 in Omp5. When the amino acid sequences encoded by two
other pEX clones were compared to the sequences of Omp4 and
Omp5 they also had amino acid homology to the genes. It is
seen that the two clones have homology to the same area in
the Omp4 and Omp5 proteins. Consequently, the pEX clones must
have originated from two additional genes. Therefore these
genes were named Omp6 and Omp7. Similar analyses were
performed with the other genes. In contrast to what was seen
for Omp4 and 5 none of the other putative omp proteins had a
cleavage site for signal peptides.

EXAMPLE 2 

Polyclonal monospecific antibodies against pEX fusion
proteins and full length recombinant Omp4

To investigate the topology of the Omp4-7 proteins,
representative pEX clones were selected from each gene. The
fusion proteins of β-galactosidase/omp were induced, and the
proteins were partially purified as inclusion bodies. Balb/c
mice were immunized three times intramuscularly with the
antigens at an interval of one week, and after six weeks the
serum was obtained from the mice. HeLa cells were infected
with the C. pneumoniae. 72 hours after the infection the
mono-layers were fixed with 3.7% formaldehyde. This treatment
makes the outer membrane of the Chlamydia impermeable for
antibodies due to the extensive cross-linking of the outer
membrane proteins by the formaldehyde. The HeLa cells were
permeabilized with 0.2% Triton X100, the monolayers were
washed in PBS, then incubated with 20% (v/v) FCS to
inactivate free radicals of the formaldehyde. The mice sera
were diluted 1:100 in PBS with 20% (v/v) FCS and incubated with
the monolayers for half an hour. The monolayers were washed
in PBS and secondary FITC-conjugated rabbit anti mouse serum
was added for half an hour, and the monolayers were washed
and mounted. Several of the antibodies reacted strongly with
the EBs in the inclusions (Figure 9). In spite of the
formaldehyde fixation it could not be excluded that the
surface of the EB was changed by the treatments, so that the
antibodies could get access to the Omp4-7. Therefore, the
reaction was confirmed by immuno-electron microscopy with the
antibody raised against clone pEX3-36. Purified EB of C.
pneumoniae were absorbed to carbon coated nickel grids. After
the absorption the grids were washed with PBS and blocked in
0.5% ovalbumin dissolved in PBS. The antibodies were diluted
1:100 in the same buffer and incubated for 30 minutes. The
grids were washed in PBS. Rabbit anti mouse Ig conjugated
with 10 nm colloidal gold diluted in PBS containing 1% gelatin
was added to the grids for half an hour. The grids were
washed 3 times in PBS with 1% gelatin and 3 times in PBS, and
the grids were contrast-stained with 0.7% phosphotungstic acid.
The grids were analysed in a Jeol 1010 electron microscope at 40
kV. It was seen that the gold particles were covering the
surface of the purified EB. Because the C. pneumoniae EBs
were not exposed to any detergent or fixation during either
the purification or the reaction with antibodies, these
results show that the cloned proteins have surface exposed
epitopes.

Polyclonal monospecific antibodies against Omp4 

The Omp4 gene was amplified by PCR with primers that
contained LIC-sites, and the PCR product was cloned into the
pET-30 LIC vector (Novagen). The histidine tagged fusion
protein was expressed by induction of the synthesis by IPTG
and purified over a nickel column. The purified Omp4 protein
was used for immunization of a rabbit (six times, 8 µg each
time).

Use of rabbit polyclonal antibodies to recombinant Omp4 for 
detection of Chlamydia pneumoniae in paraffin embedded 
sections 

The lungs of C. pneumoniae infected mice were obtained three
days after intranasal infection. The tissue samples were
fixed in 4% formaldehyde, paraffin embedded, sectioned and
deparaffinized prior to staining. The sections were incubated
with the rabbit serum diluted 1:200 in TBS (150 mM NaCl,
20 mM Tris pH 7.5) for 30 min at room temperature. After
washing two times in TBS the sections were incubated with the
secondary antibody (biotinylated goat anti-rabbit antibodies)
diluted 1:300 in TBS, followed by two washes in TBS. The
sections were stained with streptavidin-biotin complex
(streptABComplex/AP, Dako) for 30 min, washed and developed
under microscopic inspection with chromogen + new fuchsin
(Vector laboratories). The sections were counterstained with
Hematoxylin and analyzed by microscopy.

Immunoblotting analysis with hyperimmune monospecific rabbit
antiserum

The insert of the pEX1-1 clone was amplified by PCR using primers
containing LIC sites, so that the PCR product could be
inserted into the pET-32 LIC vector (Novagen, UK, cat. No. 69076-1).
The insert sequence of the pEX1-1 clone was thereby
expressed in the new vector as a fusion protein; the part of
the fusion protein encoded by the pET-32 LIC vector has 6
histidine residues in a row. Expression of the fusion
protein was induced in this vector, and the fusion protein
could be purified under denaturing conditions on a Ni2+ column
owing to the high affinity of the histidine residues for
divalent cations. The purified protein was used for
immunization of a New Zealand white rabbit. After 6
intramuscular and 2 intravenous immunizations, serum
was obtained from the rabbit. Purified C. pneumoniae EB was
dissolved in SDS sample buffer. Half of the sample was heated
to 100°C in the sample buffer, whereas the other half
was not heated. The samples were separated by
SDS-PAGE, the proteins were transferred to
nitrocellulose, and the serum was reacted with the strips. With
the samples heated to 100°C the serum recognized a high
molecular weight band of approximately 98 kDa. This is in
agreement with the predicted size of Omp5, of which the
pEX1-1 clone is a part. However, when the antibody was
reacted with the strip carrying unheated EB, the pattern was
different: a band was seen with a size of 75 kDa, and in
addition weaker bands were observed above this band (Figure
10). These data demonstrate that Omp5 needs boiling in
SDS sample buffer to be fully denatured and to migrate with the
size predicted from the gene product. When the samples
were not boiled, the protein was not fully denatured, less
SDS bound to the protein, and it retained a more globular structure
that migrates faster in the acrylamide gel. The band
pattern looked identical to that obtained with a
monoclonal antibody (MAb 26.1) (lane 6) that we have described earlier
(Christiansen et al., 1994), which reacts with the
surface of C. pneumoniae EB but does not react
with fully SDS-denatured C. pneumoniae EB in
immunoblotting.
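The comparison between an observed band and the "size predicted from the gene product" can be illustrated with a rough mass estimate. The sketch below is only an illustration, not the method used in this work: it sums standard average residue masses for a one-letter protein sequence, and the function name `predicted_mass_kda` is hypothetical. It ignores signal-peptide cleavage and any post-translational modification.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added back for the free N- and C-termini


def predicted_mass_kda(protein: str) -> float:
    """Approximate mass in kDa of a protein given as one-letter codes."""
    total = sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER
    return total / 1000.0


# Example: the first nine residues of SEQ ID NO: 2 (Met Lys Thr Ser Ile Pro Trp Val Leu).
print(round(predicted_mass_kda("MKTSIPWVL"), 3))
```

Applied to a full-length sequence, such an estimate gives the size expected for the fully denatured protein, against which the bands of heated and unheated samples can be compared.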

Owing to the altered migration of the Omp4-7
proteins without boiling, we chose to analyse antibodies
against C. pneumoniae EBs after an experimental infection of
mice. To obtain antibodies from an infection caused by C.
pneumoniae, C57 black mice were inoculated intranasally with
10^7 CFI of C. pneumoniae under light ether anaesthesia.
After 14 days of infection the serum samples were obtained
and the lungs were analysed for pathological changes. In two
of the mice a severe pneumonia was observed in the lung
sections, and in the third mouse only minor changes were
observed. The serum from the mice was diluted 1:100 and
reacted with purified EBs dissolved in sample buffer with and
without boiling. In the preparations that had been heated to
100°C, the sera from two of the mice reacted strongly with
bands of 60/62 kDa and more weakly with bands of 55 kDa, but no
reaction was observed with proteins of the size of Omp4-7
(Figure 11). However, when the sera were reacted with the
preparation that had not been heated, they all showed a strong
reaction with a broad band of approximately 75 kDa.
This is in agreement with the size of the Omp4-7 proteins in
the unheated preparation. It could therefore be concluded
that the epitopes of the Omp4-7 proteins recognized by the
antibodies after a C. pneumoniae infection are discontinuous
epitopes, because full denaturation of the antigen
completely destroyed them. The 75 kDa protein
observed in unheated samples is not Omp2 (shown by
immunoblotting with an Omp2-specific antibody).

EXAMPLE 3 

Comparison of Omp4-7 of C. pneumoniae with putative outer
membrane proteins (POMP) of C. psittaci

Longbottom et al. (1996) have published partial sequences from
98 to 90 kDa proteins from C. psittaci. They have entered the
full sequences of 5 genes in this family in the EMBL database.
They have named the genes "putative outer membrane proteins"
(POMP) since their precise location was not determined. The
family is composed of two genes that are completely
identical, and two genes with high homology to these genes.
They calculated a molecular size of 90 and 91 kDa. The fifth
gene encodes a protein of 98 kDa. The sequences of the Omp4-7
proteins of C. pneumoniae were compared to the sequences of
the C. psittaci POMP proteins with the programme pileup in
the GCG package. The amino acid homologies were in the range
of 51-63%. The C. pneumoniae Omp4-5 proteins
are most closely related to the 98 kDa POMP protein of C. psittaci.
Interestingly, the 98 kDa C. psittaci POMP protein is more
related to the C. pneumoniae genes than to the other C.
psittaci genes. The repeated GGAI sequences were conserved
in the 98 kDa POMP protein, but only three GGAI repeats were
present in the 90 and 91 kDa C. psittaci POMP proteins. For
C. psittaci it has been shown that antibodies to these
proteins seem to be protective against infection.
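
A percentage such as the 51-63% range above is typically derived from a pairwise comparison of aligned sequences. The following minimal sketch is not the GCG pileup procedure. It assumes two sequences that have already been aligned with `-` as the gap character and counts identical positions over the gap-free columns. The function name `percent_identity` is hypothetical, and the "homologies" reported above may have been computed as similarity rather than strict identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy example with a made-up alignment of two short peptides.
print(percent_identity("MKTSIPWVL", "MKSQF-SWL"))
```

Which columns count in the denominator (all columns, gap-free columns, or the shorter sequence) changes the result, so any reported percentage should state the convention used.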

SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT 

(A) NAME: Svend Birkelund 

(B) STREET: Dept. of Medical Microbiology and Immunology, 

University of Arhus 

(C) CITY: Arhus C 

(D) STATE OR PROVINCE : 

(E) COUNTRY: Denmark 

(F) POSTAL CODE: 8000 

(ii) TITLE OF THE INVENTION: Chlamydia pneumoniae antigens

(iii) NUMBER OF SEQUENCES: 30

(iv) COMPUTER - READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(v) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 

(2) INFORMATION FOR SEQ ID NO : 1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence
(B) LOCATION: 205...2987
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 

CAATGTCGAA GAGAGCACTA ACCAGGAAAA TTGCGATTTC ATAAACCCAC TTTATTATTA 60
AATTCTTACT TGCGTCATAT AAAATAGAAA ACTCAGAGAG TCAAGATAAA AATTCTTGAC 120
AGCTGTTTTG TCATCTTTAA CTTGATTTAC TTATTTTGTT TCTATATTGA TGCGAATAGT 180

TCTCTAAAAA ACAAAAGCAT TACC ATG AAG ACT TCG ATT CCT TGG GTT TTA 231
Met Lys Thr Ser Ile Pro Trp Val Leu
1 5

GTT TCC TCC GTG TTA GCT TTC TCA TGT CAC CTA CAG TCA CTA GCT AAC 279
Val Ser Ser Val Leu Ala Phe Ser Cys His Leu Gln Ser Leu Ala Asn
10 15 20 25




GAG GAA CTT TTA TCA CCT GAT GAT AGC TTT AAT GGA AAT ATC GAT TCA 327
Glu Glu Leu Leu Ser Pro Asp Asp Ser Phe Asn Gly Asn Ile Asp Ser
30 35 40

GGA ACG TTT ACT CCA AAA ACT TCA GCC ACA ACA TAT TCT CTA ACA GGA 375
Gly Thr Phe Thr Pro Lys Thr Ser Ala Thr Thr Tyr Ser Leu Thr Gly
45 50 55

GAT GTC TTC TTT TAC GAG CCT GGA AAA GGC ACT CCC TTA TCT GAC AGT 423
Asp Val Phe Phe Tyr Glu Pro Gly Lys Gly Thr Pro Leu Ser Asp Ser
60 65 70

TGT TTT AAG CAA ACC ACG GAC AAT CTT ACC TTC TTG GGG AAC GGT CAT 471
Cys Phe Lys Gln Thr Thr Asp Asn Leu Thr Phe Leu Gly Asn Gly His
75 80 85

AGC TTA ACG TTT GGC TTT ATA GAT GCT GGC ACT CAT GCA GGT GCT GCT 519
Ser Leu Thr Phe Gly Phe Ile Asp Ala Gly Thr His Ala Gly Ala Ala
90 95 100 105

GCA TCT ACA ACA GCA AAT AAG AAT CTT ACC TTC TCA GGG TTT TCC TTA 567
Ala Ser Thr Thr Ala Asn Lys Asn Leu Thr Phe Ser Gly Phe Ser Leu
110 115 120

CTG AGT TTT GAT TCC TCT CCT AGC ACA ACG GTT ACT ACA GGT CAG GGA 615
Leu Ser Phe Asp Ser Ser Pro Ser Thr Thr Val Thr Thr Gly Gln Gly
125 130 135

ACG CTT TCC TCA GCA GGA GGC GTA AAT TTA GAA AAT ATT CGT AAA CTT 663
Thr Leu Ser Ser Ala Gly Gly Val Asn Leu Glu Asn Ile Arg Lys Leu
140 145 150

GTA GTT GCT GGG AAT TTT TCT ACT GCA GAT GGT GGA GCT ATC AAA GGA 711
Val Val Ala Gly Asn Phe Ser Thr Ala Asp Gly Gly Ala Ile Lys Gly
155 160 165

GCG TCT TTC CTT TTA ACT GGC ACT TCT GGA GAT GCT CTT TTT AGT AAC 759
Ala Ser Phe Leu Leu Thr Gly Thr Ser Gly Asp Ala Leu Phe Ser Asn
170 175 180 185

AAC TCT TCA TCA ACA AAG GGA GGA GCA ATT GCT ACT ACA GCA GGC GCT 807
Asn Ser Ser Ser Thr Lys Gly Gly Ala Ile Ala Thr Thr Ala Gly Ala
190 195 200

CGC ATA GCA AAT AAC ACA GGT TAT GTT AGA TTC CTA TCT AAC ATA GCG 855
Arg Ile Ala Asn Asn Thr Gly Tyr Val Arg Phe Leu Ser Asn Ile Ala
205 210 215

TCT ACG TCA GGA GGC GCT ATC GAT GAT GAA GGC ACG TCG ATA CTA TCG 903
Ser Thr Ser Gly Gly Ala Ile Asp Asp Glu Gly Thr Ser Ile Leu Ser
220 225 230

AAC AAC AAA TTT CTA TAT TTT GAA GGG AAT GCA GCG AAA ACT ACT GGC 951
Asn Asn Lys Phe Leu Tyr Phe Glu Gly Asn Ala Ala Lys Thr Thr Gly
235 240 245

GGT GCG ATC TGC AAC ACC AAG GCG AGT GGA TCT CCT GAA CTG ATA ATC 999
Gly Ala Ile Cys Asn Thr Lys Ala Ser Gly Ser Pro Glu Leu Ile Ile
250 255 260 265

TCT AAC AAT AAG ACT CTG ATC TTT GCT TCA AAC GTA GCA GAA ACA AGC 1047
Ser Asn Asn Lys Thr Leu Ile Phe Ala Ser Asn Val Ala Glu Thr Ser
270 275 280



GGT GGC GCC ATC CAT GCT AAA AAG CTA GCC CTT TCC TCT GGA GGC TTT 1095
Gly Gly Ala Ile His Ala Lys Lys Leu Ala Leu Ser Ser Gly Gly Phe
285 290 295

ACA GAG TTT CTA CGA AAT AAT GTC TCA TCA GCA ACT CCT AAG GGG GGT 1143
Thr Glu Phe Leu Arg Asn Asn Val Ser Ser Ala Thr Pro Lys Gly Gly
300 305 310

GCT ATC AGC ATC GAT GCC TCA GGA GAG CTC AGT CTT TCT GCA GAG ACA 1191
Ala Ile Ser Ile Asp Ala Ser Gly Glu Leu Ser Leu Ser Ala Glu Thr
315 320 325

GGA AAC ATT ACC TTT GTA AGA AAT ACC CTT ACA ACA ACC GGA AGT ACC 1239
Gly Asn Ile Thr Phe Val Arg Asn Thr Leu Thr Thr Thr Gly Ser Thr
330 335 340 345

GAT ACT CCT AAA CGT AAT GCG ATC AAC ATA GGA AGT AAC GGG AAA TTC 1287
Asp Thr Pro Lys Arg Asn Ala Ile Asn Ile Gly Ser Asn Gly Lys Phe
350 355 360

ACG GAA TTA CGG GCT GCT AAA AAT CAT ACA ATT TTC TTC TAT GAT CCC 1335
Thr Glu Leu Arg Ala Ala Lys Asn His Thr Ile Phe Phe Tyr Asp Pro
365 370 375

ATC ACT TCA GAA GGA ACC TCA TCA GAC GTA TTG AAG ATA AAT AAC GGC 1383
Ile Thr Ser Glu Gly Thr Ser Ser Asp Val Leu Lys Ile Asn Asn Gly
380 385 390

TCT GCG GGA GCT CTC AAT CCA TAT CAA GGA ACG ATT CTA TTT TCT GGA 1431
Ser Ala Gly Ala Leu Asn Pro Tyr Gln Gly Thr Ile Leu Phe Ser Gly
395 400 405

GAA ACC CTA ACA GCA GAT GAA CTT AAA GTT GCT GAC AAT TTA AAA TCT 1479
Glu Thr Leu Thr Ala Asp Glu Leu Lys Val Ala Asp Asn Leu Lys Ser
410 415 420 425

TCA TTC ACG CAG CCA GTC TCC CTA TCC GGA GGA AAG TTA TTG CTA CAA 1527
Ser Phe Thr Gln Pro Val Ser Leu Ser Gly Gly Lys Leu Leu Leu Gln
430 435 440

AAG GGA GTC ACT TTA GAG AGC ACG AGC TTC TCT CAA GAG GCC GGT TCT 1575
Lys Gly Val Thr Leu Glu Ser Thr Ser Phe Ser Gln Glu Ala Gly Ser
445 450 455

CTC CTC GGC ATG GAT TCA GGA ACG ACA TTA TCA ACT ACA GCT GGG AGT 1623
Leu Leu Gly Met Asp Ser Gly Thr Thr Leu Ser Thr Thr Ala Gly Ser
460 465 470

ATT ACA ATC ACG AAC CTA GGA ATC AAT GTT GAC TCC TTA GGT CTT AAG 1671
Ile Thr Ile Thr Asn Leu Gly Ile Asn Val Asp Ser Leu Gly Leu Lys
475 480 485



CAG CCC GTC AGC CTA ACA GCA AAA GGT GCT TCA AAT AAA GTG ATC GTA 1719
Gln Pro Val Ser Leu Thr Ala Lys Gly Ala Ser Asn Lys Val Ile Val
490 495 500 505

TCT GGG AAG CTC AAC CTG ATT GAT ATT GAA GGG AAC ATT TAT GAA AGT 1767
Ser Gly Lys Leu Asn Leu Ile Asp Ile Glu Gly Asn Ile Tyr Glu Ser
510 515 520

CAT ATG TTC AGC CAT GAC CAG CTC TTC TCT CTA TTA AAA ATC ACG GTT 1815
His Met Phe Ser His Asp Gln Leu Phe Ser Leu Leu Lys Ile Thr Val
525 530 535

GAT GCT GAT GTT GAT ACT AAC GTT GAC ATC AGC AGC CTT ATC CCT GTT 1863
Asp Ala Asp Val Asp Thr Asn Val Asp Ile Ser Ser Leu Ile Pro Val
540 545 550

CCT GCT GAG GAT CCT AAT TCA GAA TAC GGA TTC CAA GGA CAA TGG AAT 1911
Pro Ala Glu Asp Pro Asn Ser Glu Tyr Gly Phe Gln Gly Gln Trp Asn
555 560 565

GTT AAT TGG ACT ACG GAT ACA GCT ACA AAT ACA AAA GAG GCC ACG GCA 1959
Val Asn Trp Thr Thr Asp Thr Ala Thr Asn Thr Lys Glu Ala Thr Ala
570 575 580 585

ACT TGG ACC AAA ACA GGA TTT GTT CCC AGC CCC GAA AGA AAA TCT GCG 2007
Thr Trp Thr Lys Thr Gly Phe Val Pro Ser Pro Glu Arg Lys Ser Ala
590 595 600

TTA GTA TGC AAT ACC CTA TGG GGA GTC TTT ACT GAC ATT CGC TCT CTG 2055
Leu Val Cys Asn Thr Leu Trp Gly Val Phe Thr Asp Ile Arg Ser Leu
605 610 615

CAA CAG CTT GTA GAG ATC GGC GCA ACT GGT ATG GAA CAC AAA CAA GGT 2103
Gln Gln Leu Val Glu Ile Gly Ala Thr Gly Met Glu His Lys Gln Gly
620 625 630

TTC TGG GTT TCC TCC ATG ACG AAC TTC CTG CAT AAG ACT GGA GAT GAA 2151
Phe Trp Val Ser Ser Met Thr Asn Phe Leu His Lys Thr Gly Asp Glu
635 640 645

AAT CGC AAA GGC TTC CGT CAT ACC TCT GGA GGC TAC GTC ATC GGT GGA 2199
Asn Arg Lys Gly Phe Arg His Thr Ser Gly Gly Tyr Val Ile Gly Gly
650 655 660 665

AGT GCT CAC ACT CCT AAA GAC GAC CTA TTT ACC TTT GCG TTC TGC CAT 2247
Ser Ala His Thr Pro Lys Asp Asp Leu Phe Thr Phe Ala Phe Cys His
670 675 680

CTC TTT GCT AGA GAC AAA GAT TGT TTT ATC GCT CAC AAC AAC TCT AGA 2295
Leu Phe Ala Arg Asp Lys Asp Cys Phe Ile Ala His Asn Asn Ser Arg
685 690 695

ACC TAC GGT GGA ACT TTA TTC TTC AAG CAC TCT CAT ACC CTA CAA CCC 2343
Thr Tyr Gly Gly Thr Leu Phe Phe Lys His Ser His Thr Leu Gln Pro
700 705 710

CAA AAC TAT TTG AGA TTA GGA AGA GCA AAG TTT TCT GAA TCA GCT ATA 2391
Gln Asn Tyr Leu Arg Leu Gly Arg Ala Lys Phe Ser Glu Ser Ala Ile
715 720 725

GAA AAA TTC CCT AGG GAA ATT CCC CTA GCC TTG GAT GTC CAA GTT TCG 2439
Glu Lys Phe Pro Arg Glu Ile Pro Leu Ala Leu Asp Val Gln Val Ser
730 735 740 745

TTC AGC CAT TCA GAC AAC CGT ATG GAA ACG CAC TAT ACC TCA TTG CCA 2487
Phe Ser His Ser Asp Asn Arg Met Glu Thr His Tyr Thr Ser Leu Pro
750 755 760

GAA TCC GAA GGT TCT TGG AGC AAC GAG TGT ATA GCT GGT GGT ATC GGC 2535
Glu Ser Glu Gly Ser Trp Ser Asn Glu Cys Ile Ala Gly Gly Ile Gly
765 770 775

CTA GAC CTT CCT TTT GTT CTT TCC AAC CCA CAT CCT CTT TTC AAG ACC 2583
Leu Asp Leu Pro Phe Val Leu Ser Asn Pro His Pro Leu Phe Lys Thr
780 785 790

TTC ATT CCA CAG ATG AAA GTC GAA ATG GTT TAT GTA TCA CAA AAT AGC 2631
Phe Ile Pro Gln Met Lys Val Glu Met Val Tyr Val Ser Gln Asn Ser
795 800 805

TTC TTC GAA AGC TCT AGT GAT GGC CGT GGT TTT AGT ATT GGA AGG CTG 2679
Phe Phe Glu Ser Ser Ser Asp Gly Arg Gly Phe Ser Ile Gly Arg Leu
810 815 820 825

CTT AAC CTC TCG ATT CCT GTG GGT GCG AAA TTC GTG CAG GGG GAT ATC 2727
Leu Asn Leu Ser Ile Pro Val Gly Ala Lys Phe Val Gln Gly Asp Ile
830 835 840

GGA GAT TCC TAC ACC TAT GAT CTC TCA GGA TTC TTT GTT TCC GAT GTC 2775
Gly Asp Ser Tyr Thr Tyr Asp Leu Ser Gly Phe Phe Val Ser Asp Val
845 850 855

TAT CGT AAC AAT CCC CAA TCT ACA GCG ACT CTT GTG ATG AGC CCA GAC 2823
Tyr Arg Asn Asn Pro Gln Ser Thr Ala Thr Leu Val Met Ser Pro Asp
860 865 870

TCT TGG AAA ATT CGC GGT GGC AAT CTT TCA AGA CAG GCA TTT TTA CTG 2871
Ser Trp Lys Ile Arg Gly Gly Asn Leu Ser Arg Gln Ala Phe Leu Leu
875 880 885

AGG GGT AGC AAC AAC TAC GTC TAC AAC TCC AAT TGT GAG CTC TTC GGA 2919
Arg Gly Ser Asn Asn Tyr Val Tyr Asn Ser Asn Cys Glu Leu Phe Gly
890 895 900 905

CAT TAC GCT ATG GAA CTC CGT GGA TCT TCA AGG AAC TAC AAT GTA GAT 2967
His Tyr Ala Met Glu Leu Arg Gly Ser Ser Arg Asn Tyr Asn Val Asp
910 915 920

GTT GGT ACC AAA CTC CGA TT CTAGATTGCT AAAACTCCCT AGTTCTTCTA GGGAG 3022
Val Gly Thr Lys Leu Arg Phe
925

TTTTCTCATA CTTTTAGGGA AATATTTGCT ATAGGGAATG CTTTCCTTGC AAACTGTAAA 3082
AAATAACATT TGTCCCTCTT CAAAAAAGAT TTCTTTTAAT AATTTCTAGT TATAATTTTA 3142
TTTTAAAAAC AGTTAAATAA TTAATAGACA ATAATCTATT CTTATTGACT TCTTTTTT 3200
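
The amino acid line printed under each codon line of SEQ ID NO: 1 can be checked by translating the coding region with the standard genetic code. The sketch below is an illustration only and was not part of the original sequence-processing software. The names `CODON_TABLE` and `translate` are hypothetical, and the input is assumed to start in frame at the ATG.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string to one-letter amino acids, stopping at a stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces between printed codons
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First nine codons of the coding sequence (positions 205-231 of SEQ ID NO: 1).
print(translate("ATG AAG ACT TCG ATT CCT TGG GTT TTA"))  # MKTSIPWVL
```

The printed one-letter string corresponds to Met Lys Thr Ser Ile Pro Trp Val Leu, the first row of the translation shown in the listing.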



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 928 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Lys Thr Ser Ile Pro Trp Val Leu Val Ser Ser Val Leu Ala Phe
1 5 10 15
Ser Cys His Leu Gln Ser Leu Ala Asn Glu Glu Leu Leu Ser Pro Asp
20 25 30
Asp Ser Phe Asn Gly Asn Ile Asp Ser Gly Thr Phe Thr Pro Lys Thr
35 40 45
Ser Ala Thr Thr Tyr Ser Leu Thr Gly Asp Val Phe Phe Tyr Glu Pro
50 55 60
Gly Lys Gly Thr Pro Leu Ser Asp Ser Cys Phe Lys Gln Thr Thr Asp
65 70 75 80
Asn Leu Thr Phe Leu Gly Asn Gly His Ser Leu Thr Phe Gly Phe Ile
85 90 95
Asp Ala Gly Thr His Ala Gly Ala Ala Ala Ser Thr Thr Ala Asn Lys
100 105 110
Asn Leu Thr Phe Ser Gly Phe Ser Leu Leu Ser Phe Asp Ser Ser Pro
115 120 125
Ser Thr Thr Val Thr Thr Gly Gln Gly Thr Leu Ser Ser Ala Gly Gly
130 135 140
Val Asn Leu Glu Asn Ile Arg Lys Leu Val Val Ala Gly Asn Phe Ser
145 150 155 160
Thr Ala Asp Gly Gly Ala Ile Lys Gly Ala Ser Phe Leu Leu Thr Gly
165 170 175
Thr Ser Gly Asp Ala Leu Phe Ser Asn Asn Ser Ser Ser Thr Lys Gly
180 185 190
Gly Ala Ile Ala Thr Thr Ala Gly Ala Arg Ile Ala Asn Asn Thr Gly
195 200 205
Tyr Val Arg Phe Leu Ser Asn Ile Ala Ser Thr Ser Gly Gly Ala Ile
210 215 220
Asp Asp Glu Gly Thr Ser Ile Leu Ser Asn Asn Lys Phe Leu Tyr Phe
225 230 235 240
Glu Gly Asn Ala Ala Lys Thr Thr Gly Gly Ala Ile Cys Asn Thr Lys
245 250 255
Ala Ser Gly Ser Pro Glu Leu Ile Ile Ser Asn Asn Lys Thr Leu Ile
260 265 270
Phe Ala Ser Asn Val Ala Glu Thr Ser Gly Gly Ala Ile His Ala Lys
275 280 285
Lys Leu Ala Leu Ser Ser Gly Gly Phe Thr Glu Phe Leu Arg Asn Asn
290 295 300
Val Ser Ser Ala Thr Pro Lys Gly Gly Ala Ile Ser Ile Asp Ala Ser
305 310 315 320
Gly Glu Leu Ser Leu Ser Ala Glu Thr Gly Asn Ile Thr Phe Val Arg
325 330 335
Asn Thr Leu Thr Thr Thr Gly Ser Thr Asp Thr Pro Lys Arg Asn Ala
340 345 350
Ile Asn Ile Gly Ser Asn Gly Lys Phe Thr Glu Leu Arg Ala Ala Lys
355 360 365
Asn His Thr Ile Phe Phe Tyr Asp Pro Ile Thr Ser Glu Gly Thr Ser
370 375 380
Ser Asp Val Leu Lys Ile Asn Asn Gly Ser Ala Gly Ala Leu Asn Pro
385 390 395 400
Tyr Gln Gly Thr Ile Leu Phe Ser Gly Glu Thr Leu Thr Ala Asp Glu
405 410 415
Leu Lys Val Ala Asp Asn Leu Lys Ser Ser Phe Thr Gln Pro Val Ser
420 425 430
Leu Ser Gly Gly Lys Leu Leu Leu Gln Lys Gly Val Thr Leu Glu Ser
435 440 445
Thr Ser Phe Ser Gln Glu Ala Gly Ser Leu Leu Gly Met Asp Ser Gly
450 455 460
Thr Thr Leu Ser Thr Thr Ala Gly Ser Ile Thr Ile Thr Asn Leu Gly
465 470 475 480
Ile Asn Val Asp Ser Leu Gly Leu Lys Gln Pro Val Ser Leu Thr Ala
485 490 495
Lys Gly Ala Ser Asn Lys Val Ile Val Ser Gly Lys Leu Asn Leu Ile
500 505 510
Asp Ile Glu Gly Asn Ile Tyr Glu Ser His Met Phe Ser His Asp Gln
515 520 525
Leu Phe Ser Leu Leu Lys Ile Thr Val Asp Ala Asp Val Asp Thr Asn
530 535 540
Val Asp Ile Ser Ser Leu Ile Pro Val Pro Ala Glu Asp Pro Asn Ser
545 550 555 560
Glu Tyr Gly Phe Gln Gly Gln Trp Asn Val Asn Trp Thr Thr Asp Thr
565 570 575
Ala Thr Asn Thr Lys Glu Ala Thr Ala Thr Trp Thr Lys Thr Gly Phe
580 585 590
Val Pro Ser Pro Glu Arg Lys Ser Ala Leu Val Cys Asn Thr Leu Trp
595 600 605
Gly Val Phe Thr Asp Ile Arg Ser Leu Gln Gln Leu Val Glu Ile Gly
610 615 620
Ala Thr Gly Met Glu His Lys Gln Gly Phe Trp Val Ser Ser Met Thr
625 630 635 640
Asn Phe Leu His Lys Thr Gly Asp Glu Asn Arg Lys Gly Phe Arg His
645 650 655
Thr Ser Gly Gly Tyr Val Ile Gly Gly Ser Ala His Thr Pro Lys Asp
660 665 670
Asp Leu Phe Thr Phe Ala Phe Cys His Leu Phe Ala Arg Asp Lys Asp
675 680 685
Cys Phe Ile Ala His Asn Asn Ser Arg Thr Tyr Gly Gly Thr Leu Phe
690 695 700
Phe Lys His Ser His Thr Leu Gln Pro Gln Asn Tyr Leu Arg Leu Gly
705 710 715 720
Arg Ala Lys Phe Ser Glu Ser Ala Ile Glu Lys Phe Pro Arg Glu Ile
725 730 735
Pro Leu Ala Leu Asp Val Gln Val Ser Phe Ser His Ser Asp Asn Arg
740 745 750
Met Glu Thr His Tyr Thr Ser Leu Pro Glu Ser Glu Gly Ser Trp Ser
755 760 765
Asn Glu Cys Ile Ala Gly Gly Ile Gly Leu Asp Leu Pro Phe Val Leu
770 775 780
Ser Asn Pro His Pro Leu Phe Lys Thr Phe Ile Pro Gln Met Lys Val
785 790 795 800
Glu Met Val Tyr Val Ser Gln Asn Ser Phe Phe Glu Ser Ser Ser Asp
805 810 815
Gly Arg Gly Phe Ser Ile Gly Arg Leu Leu Asn Leu Ser Ile Pro Val
820 825 830
Gly Ala Lys Phe Val Gln Gly Asp Ile Gly Asp Ser Tyr Thr Tyr Asp
835 840 845
Leu Ser Gly Phe Phe Val Ser Asp Val Tyr Arg Asn Asn Pro Gln Ser
850 855 860
Thr Ala Thr Leu Val Met Ser Pro Asp Ser Trp Lys Ile Arg Gly Gly
865 870 875 880
Asn Leu Ser Arg Gln Ala Phe Leu Leu Arg Gly Ser Asn Asn Tyr Val
885 890 895
Tyr Asn Ser Asn Cys Glu Leu Phe Gly His Tyr Ala Met Glu Leu Arg
900 905 910
Gly Ser Ser Arg Asn Tyr Asn Val Asp Val Gly Thr Lys Leu Arg Phe
915 920 925



(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2815 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

ATGAAATCGC AATTTTCCTG GTTAGTGCTC TCTTCGACAT TGGCATGTTT TACTAGTTGT 60
TCCACTGTTT TTGCTGCAAC TGCTGAAAAT ATAGGCCCCT CTGATAGCTT TGACGGAAGT 120
ACTAACACAG GCACCTATAC TCCTAAAAAT ACGACTACTG GAATAGACTA TACTCTGACA 180
GGAGATATAA CTCTGCAAAA CCTTGGGGAT TCGGCAGCTT TAACGAAGGG TTGTTTTTCT 240
GACACTACGG AATCTTTAAG CTTTGCCGGT AAGGGGTACT CACTTTCTTT TTTAAATATT 300
AAGTCTAGTG CTGAAGGCGC AGCACTTTCT GTTACAACTG ATAAAAATCT GTCGCTAACA 360
GGATTTTCGA GTCTTACTTT CTTAGCGGCC CCATCATCGG TAATCACAAC CCCCTCAGGA 420
AAAGGTGCAG TTAAATGTGG AGGGGATCTT ACATTTGATA ACAATGGAAC TATTTTATTT 480
AAACAAGATT ACTGTGAGGA AAATGGCGGA GCCATTTCTA CCAAGAATCT TTCTTTGAAA 540
AACAGCACGG GATCGATTTC TTTTGAAGGG AATAAATCGA GCGCAACAGG GAAAAAAGGT 600
GGGGCTATTT GTGCTACTGG TACTGTAGAT ATTACAAATA ATACGGCTCC TACCCTCTTC 660
TCGAACAATA TTGCTGAAGC TGCAGGTGGA GCTATAAATA GCACAGGAAA CTGTACAATT 720
ACAGGGAATA CGTCTCTTGT ATTTTCTGAA AATAGTGTGA CAGCGACCGC AGGAAATGGA 780
GGAGCTCTTT CTGGAGATGC CGATGTTACC ATATCTGGGA ATCAGAGTGT AACTTTCTCA 840
GGAAACCAAG CTGTAGCTAA TGGCGGAGCC ATTTATGCTA AGAAGCTTAC ACTGGCTTCC 900
GGGGGGGGGG GGGGTATCTC CTTTTCTAAC AATATAGTCC AAGGTACCAC TGCAGGTAAT 960
GGTGGAGCCA TTTCTATACT GGCAGCTGGA GAGTGTAGTC TTTCAGCAGA AGCAGGGGAC 1020
ATTACCTTCA ATGGGAATGC CATTGTTGCA ACTACACCAC AAACTACAAA AAGAAATTCT 1080
ATTGACATAG GATCTACTGC AAAGATCACG AATTTACGTG CAATATCTGG GCATAGCATC 1140
TTTTTCTACG ATCCGATTAC TGCTAATACG GCTGCGGATT CTACAGATAC TTTAAATCTC 1200
AATAAGGCTG ATGCAGGTAA TAGTACAGAT TATAGTGGGT CGATTGTTTT TTCTGGTGAA 1260
AAGCTCTCTG AAGATGAAGC AAAAGTTGCA GACAACCTCA CTTCTACGCT GAAGCAGCCT 1320
GTAACTCTAA CTGCAGGAAA TTTAGTACTT AAACGTGGTG TCACTCTCGA TACGAAAGGC 1380
TTTACTCAGA CCGCGGGTTC CTCTGTTATT ATGGATGCGG GCACAACGTT AAAAGCAAGT 1440
ACAGAGGAGG TCACTTTAAC AGGTCTTTCC ATTCCTGTAG ACTCTTTAGG CGAGGGTAAG 1500
AAAGTTGTAA TTGCTGCTTC TGCAGCAAGT AAAAATGTAG CCCTTAGTGG TCCGATTCTT 1560
CTTTTGGATA ACCAAGGGAA TGCTTATGAA AATCACGACT TAGGAAAAAC TCAAGACTTT 1620
TCATTTGTGC AGCTCTCTGC TCTGGGTACT GCAACAACTA CAGATGTTCC AGCGGTTCCT 1680
ACAGTAGCAA CTCCTACGCA CTATGGGTAT CAAGGTACTT GGGGAATGAC TTGGGTTGAT 1740
GATACCGCAA GCACTCCAAA GACTAAGACA GCGACATTAG CTTGGACCAA TACAGGCTAC 1800
CTTCCGAATC CTGAGCGTCA AGGACCTTTA GTTCCTAATA GCCTTTGGGG ATCTTTTTCA 1860
GACATCCAAG CGATTCAAGG TGTCATAGAG AGAAGTGCTT TGACTCTTTG TTCAGATCGA 1920
GGCTTCTGGG CTGCGGGAGT CGCCAATTTC TTAGATAAAG ATAAGAAAGG GGAAAAACGC 1980
AAATACCGTC ATAAATCTGG TGGATATGCT ATCGGAGGTG CAGCGCAAAC TTGTTCTGAA 2040
AACTTAATTA GCTTTGCCTT TTGCCAACTC TTTGGTAGCG ATAAAGATTT CTTAGTCGCT 2100
AAAAATCATA CTGATACCTA TGCAGGAGCC TTCTATATCC AACACATTAC AGAATGTAGT 2160
GGGTTCATAG GTTGTCTCTT AGATAAACTT CCTGGCTCTT GGAGTCATAA ACCCCTCGTT 2220
TTAGAAGGGC AGCTCGCTTA TAGCCACGTC AGTAATGATC TGAAGACAAA GTATACTGCG 2280
TATCCTGAGG TGAAAGGTTC TTGGGGGAAT AATGCTTTTA ACATGATGTT GGGAGCTTCT 2340
TCTCATTCTT ATCCTGAATA CCTGCATTGT TTTGATACCT ATGCTCCATA CATCAAACTG 2400
AATCTGACCT ATATACGTCA GGACAGCTTC TCGGAGAAAG GTACAGAAGG AAGATCTTTT 2460
GATGACAGCA ACCTCTTCAA TTTATCTTTG CCTATAGGGG TGAAGTTTGA GAAGTTCTCT 2520
GATTGTAATG ACTTTTCTTA TGATCTGACT TTATCCTATG TTCCTGATCT TATCCGCAAT 2580
GATCCCAAAT GCACTACAGC ACTTGTAATC AGCGGAGCCT CTTGGGAAAC TTATGCCAAT 2640
AACTTAGCAC GACAGGCCTT GCAAGTGCGT GCAGGCAGTC ACTACGCCTT CTCTCCTATG 2700
TTTGAAGTGC TCGGCCAGTT TGTCTTTGAA GTTCGTGGAT CCTCACGGAT TTATAATGTA 2760
GATCTTGGGG GTAAGTTCCA ATTCTAGGAG CGTCTCTCAT GTCTCAGAAA TTCTG 2815



(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 928 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 



Met Lys Ser Gln Phe Ser Trp Leu Val Leu Ser Ser Thr Leu Ala Cys
1 5 10 15
Phe Thr Ser Cys Ser Thr Val Phe Ala Ala Thr Ala Glu Asn Ile Gly
20 25 30
Pro Ser Asp Ser Phe Asp Gly Ser Thr Asn Thr Gly Thr Tyr Thr Pro
35 40 45
Lys Asn Thr Thr Thr Gly Ile Asp Tyr Thr Leu Thr Gly Asp Ile Thr
50 55 60
Leu Gln Asn Leu Gly Asp Ser Ala Ala Leu Thr Lys Gly Cys Phe Ser
65 70 75 80
Asp Thr Thr Glu Ser Leu Ser Phe Ala Gly Lys Gly Tyr Ser Leu Ser
85 90 95
Phe Leu Asn Ile Lys Ser Ser Ala Glu Gly Ala Ala Leu Ser Val Thr
100 105 110
Thr Asp Lys Asn Leu Ser Leu Thr Gly Phe Ser Ser Leu Thr Phe Leu
115 120 125
Ala Ala Pro Ser Ser Val Ile Thr Thr Pro Ser Gly Lys Gly Ala Val
130 135 140
Lys Cys Gly Gly Asp Leu Thr Phe Asp Asn Asn Gly Thr Ile Leu Phe
145 150 155 160
Lys Gln Asp Tyr Cys Glu Glu Asn Gly Gly Ala Ile Ser Thr Lys Asn
165 170 175
Leu Ser Leu Lys Asn Ser Thr Gly Ser Ile Ser Phe Glu Gly Asn Lys
180 185 190
Ser Ser Ala Thr Gly Lys Lys Gly Gly Ala Ile Cys Ala Thr Gly Thr
195 200 205
Val Asp Ile Thr Asn Asn Thr Ala Pro Thr Leu Phe Ser Asn Asn Ile
210 215 220
Ala Glu Ala Ala Gly Gly Ala Ile Asn Ser Thr Gly Asn Cys Thr Ile
225 230 235 240
Thr Gly Asn Thr Ser Leu Val Phe Ser Glu Asn Ser Val Thr Ala Thr
245 250 255
Ala Gly Asn Gly Gly Ala Leu Ser Gly Asp Ala Asp Val Thr Ile Ser
260 265 270
Gly Asn Gln Ser Val Thr Phe Ser Gly Asn Gln Ala Val Ala Asn Gly
275 280 285
Gly Ala Ile Tyr Ala Lys Lys Leu Thr Leu Ala Ser Gly Gly Gly Gly
290 295 300
Gly Ile Ser Phe Ser Asn Asn Ile Val Gln Gly Thr Thr Ala Gly Asn
305 310 315 320
Gly Gly Ala Ile Ser Ile Leu Ala Ala Gly Glu Cys Ser Leu Ser Ala
325 330 335
Glu Ala Gly Asp Ile Thr Phe Asn Gly Asn Ala Ile Val Ala Thr Thr
340 345 350
Pro Gln Thr Thr Lys Arg Asn Ser Ile Asp Ile Gly Ser Thr Ala Lys
355 360 365
Ile Thr Asn Leu Arg Ala Ile Ser Gly His Ser Ile Phe Phe Tyr Asp
370 375 380
Pro Ile Thr Ala Asn Thr Ala Ala Asp Ser Thr Asp Thr Leu Asn Leu
385 390 395 400
Asn Lys Ala Asp Ala Gly Asn Ser Thr Asp Tyr Ser Gly Ser Ile Val
405 410 415
Phe Ser Gly Glu Lys Leu Ser Glu Asp Glu Ala Lys Val Ala Asp Asn
420 425 430
Leu Thr Ser Thr Leu Lys Gln Pro Val Thr Leu Thr Ala Gly Asn Leu
435 440 445
Val Leu Lys Arg Gly Val Thr Leu Asp Thr Lys Gly Phe Thr Gln Thr
450 455 460
Ala Gly Ser Ser Val Ile Met Asp Ala Gly Thr Thr Leu Lys Ala Ser
465 470 475 480
Thr Glu Glu Val Thr Leu Thr Gly Leu Ser Ile Pro Val Asp Ser Leu
485 490 495
Gly Glu Gly Lys Lys Val Val Ile Ala Ala Ser Ala Ala Ser Lys Asn
500 505 510
Val Ala Leu Ser Gly Pro Ile Leu Leu Leu Asp Asn Gln Gly Asn Ala
515 520 525
Tyr Glu Asn His Asp Leu Gly Lys Thr Gln Asp Phe Ser Phe Val Gln
530 535 540
Leu Ser Ala Leu Gly Thr Ala Thr Thr Thr Asp Val Pro Ala Val Pro
545 550 555 560
Thr Val Ala Thr Pro Thr His Tyr Gly Tyr Gln Gly Thr Trp Gly Met
565 570 575
Thr Trp Val Asp Asp Thr Ala Ser Thr Pro Lys Thr Lys Thr Ala Thr
580 585 590
Leu Ala Trp Thr Asn Thr Gly Tyr Leu Pro Asn Pro Glu Arg Gln Gly
595 600 605
Pro Leu Val Pro Asn Ser Leu Trp Gly Ser Phe Ser Asp Ile Gln Ala
610 615 620
Ile Gln Gly Val Ile Glu Arg Ser Ala Leu Thr Leu Cys Ser Asp Arg
625 630 635 640
Gly Phe Trp Ala Ala Gly Val Ala Asn Phe Leu Asp Lys Asp Lys Lys
645 650 655
Gly Glu Lys Arg Lys Tyr Arg His Lys Ser Gly Gly Tyr Ala Ile Gly
660 665 670
Gly Ala Ala Gln Thr Cys Ser Glu Asn Leu Ile Ser Phe Ala Phe Cys
675 680 685
Gln Leu Phe Gly Ser Asp Lys Asp Phe Leu Val Ala Lys Asn His Thr
690 695 700
Asp Thr Tyr Ala Gly Ala Phe Tyr Ile Gln His Ile Thr Glu Cys Ser
705 710 715 720
Gly Phe Ile Gly Cys Leu Leu Asp Lys Leu Pro Gly Ser Trp Ser His
725 730 735
Lys Pro Leu Val Leu Glu Gly Gln Leu Ala Tyr Ser His Val Ser Asn
740 745 750
Asp Leu Lys Thr Lys Tyr Thr Ala Tyr Pro Glu Val Lys Gly Ser Trp
755 760 765
Gly Asn Asn Ala Phe Asn Met Met Leu Gly Ala Ser Ser His Ser Tyr
770 775 780
Pro Glu Tyr Leu His Cys Phe Asp Thr Tyr Ala Pro Tyr Ile Lys Leu
785 790 795 800
Asn Leu Thr Tyr Ile Arg Gln Asp Ser Phe Ser Glu Lys Gly Thr Glu
805 810 815
Gly Arg Ser Phe Asp Asp Ser Asn Leu Phe Asn Leu Ser Leu Pro Ile
820 825 830
Gly Val Lys Phe Glu Lys Phe Ser Asp Cys Asn Asp Phe Ser Tyr Asp
835 840 845
Leu Thr Leu Ser Tyr Val Pro Asp Leu Ile Arg Asn Asp Pro Lys Cys
850 855 860
Thr Thr Ala Leu Val Ile Ser Gly Ala Ser Trp Glu Thr Tyr Ala Asn
865 870 875 880
Asn Leu Ala Arg Gln Ala Leu Gln Val Arg Ala Gly Ser His Tyr Ala
885 890 895
Phe Ser Pro Met Phe Glu Val Leu Gly Gln Phe Val Phe Glu Val Arg
900 905 910
Gly Ser Ser Arg Ile Tyr Asn Val Asp Leu Gly Gly Lys Phe Gln Phe
915 920 925









(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3052 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGCGATTTT CGCTCTGCGG ATTTCCTCTA GTTTTTTCTT TAACATTGCT CTCAGTCTTC 60

GACACTTCTT TGAGTGCTAC TACGATTTCT TTAACCCCAG AAGATAGTTT TCATGGAGAT 120

AGTCAGAATG CAGAACGTTC TTATAATGTT CAAGCTGGGG ATGTCTATAG CCTTACTGGT 180

GATGTCTCAA TATCTAACGT CGATAACTCT GCATTAAATA AAGCCTGCTT CAATGTGACC 240

TCAGGAAGTG TGACGTTCGC AGGAAATCAT CATGGGTTAT ATTTTAATAA TATTTCCTCA 300

^S! ACAA ******** TGTACTTTGT TGCCAAGATC CTCAAGCAAC SSSSS 3 0 

JSSJSJT CCACGCTCTC TTTTATTCAG AGCCCCGGAG ATATTAAAGA ACAGGGATGT 420 

CTCTATTCAA AAAATGCACT TATGCTCTTA AACAATTATG TAGTGCGTTT TGAACAAAAC 480 

CAAAGTAAGA CTAAAGGCGG AGCTATTAGT GGGGCGAATG TTACTATAGT AGGCAACTAC 540 

GATTCCGTCT CTTTCTATCA GAATGCAGCC ACTTTTGGAG GTGCTATCCA TTCTTCAGGT 600

CCCCTACAGA TTGCAGTAAA TCAGGCAGAG ATAAGATTTG CACAAAATAC TGCCAAGAAT 660 

GGTTCTGGAG GGGCTTTGTA CTCCGATGGT GATATTGATA TTGATCAGAA TgSJJSS 120 

ZlVZTr^Z AAAATGAGGC ATTGACTACT GCTATAGGTA AGGGAGGGGC TGTCTGTTGT 780 

SISSSJ CAGGAAGTAG TACTCCAGTT CCTATTGTGA CTTTCTCTGA CAATAAACAG 840 

TTAGTCTTTG AAAGAAACCA TTCCATAATG GGTGGCGGAG CCATTTATGC TAGGAAACTT 900 

XSSSST CAGG AGGT CC TACTCTATTT ATCAATAATA TATCATATGC Sa^SI 96 

AATTTAGGTG GAGCTATTGC CATTGATACT GGAGGGGAGA TCAGTTTATC AGCAGAGAAA 1020 

GGAACAATTA CATTCCAAGG AAACCGGACG AGCTTACCGT TTTTGAATGG a££S££ 1080 

TTACAAAATG CTAAATTCCT GAAATTACAG GCGAGAAATG GATGCTCTAT AGAATTTTAT 1140 

GATCCTATTA CTTCTGAAGC AGATGGGTCT ACCCAATTGA ATATCAACGG AGATCcJSS \lol 

AATAAAGAGT ACACAGGGAC CATACTCTTT TCTGGAGAAA AGAGTCTAGC AAACGATCCT 1260 

AGGGATTTTA AATCTACAAT CCCTCAGAAC GTCAACCTGT CTGCAGGATA CtJSSSS ^320 

AAAGAGGGGG CCGAAGTCAC AGTTTCAAAA TTCACGCAGT CTCCAGGATC GCATTTAGTT 1380 

SSSJJIS? GAACCAAACT GATAGCCTCT AAGGAAGACA TTGCCAtSc AgSSSgCG 14 4 S 

ATAGATATAG ATAGCTTAAG CTCATCCTCA ACAGCAGCTG TTATTAAAGC AAACACCGCA 1500
AATAAACAGA TATCCGTGAC GGACTCTATA GAACTTATCT CGCCTACTGG 

GAAGATCTCA GAATGAGAAA TTCACAGACG TTCCCTCTGC TCTCTTTAGA GCCtcSaGCC 1620 

GGGGGTAGTG TGACTGTAAC TGCTGGAGAT TTCCTACCGG TAAGTCC^S SSSSSSS 1680 

caaggcaatt ggaaattagc ttggacagga actggaaaca aagttggaga attcttSgg 1140 

GATAAAATAA ATTATAAGCC TAGACCTGAA AAAGAAGGAA ATTTAGTTCC TAATATcSg IBOO 

TGGGGGAATG CTGTAAATGT CAGATCCTTA ATGCAGGTTC AAGAGACCCA TOoSSE? i860 

7rrr^° ATCGAGGGCT GTGGATCGAT GGAATTGGGA ATTTCTTCCA SSSSSS ^92 

TCCGAAGACA ATATAAGGTA CCGTCATAAC AGCGGTGGAT ATGTTCTATC TGTAAATAAT 1980 

GAGATCACAC CTAAGCACTA TACTTCGATG GCATTTTCCC AACTCTTTAG TAGAGACAAG 2040

GACTATGCGG TTTCCAACAA CGAATACAGA ATGTATTTAG GATCGTATCT CTATCAATAT 2100
ACAACCTCCC TAGGGAATAT TTTCCGTTAT GCTTCGCGTA ACCCTAATGT £££££££ 

ATTCTCTCAA GAAGGTTTCT TCAAAATCCT CTTATGATTT TTCATTTTTT GTGTGCTTAT 2220

GGTCATGCCA CCAATGATAT GAAAACAGAC TACGCAAATT TCCCTATGGT gSS2SS£ 22sS 

TGGAGAAACA ATTGTTGGGC TATAGAGTGC GGAGGGAGCA TGCCTCTATT gg^toaS lllo 

AACGGAAGAC TTTTCCAAGG TGCCATCCCA TTTATGAAAC TACAATTaS tSSSSSE 2400 

CAGGGAGATT tcaaagagac gactgcagat ggccgtagat ttagtaatgg gagtSUS lleo 

tcgatttctg tacctctagg catacgcttt gagaagctgg cactttctca ggatgta^t? 25" 

ZSSZZ"* GTTTCTCCTA TATTCCTGAT ATTTTCCGTA AGGATCCCTC ATOTgSgS ^80 

GCTCTGGTGA TTAGCGGAGA CTCCTGGCTT GTTCCGGCAG CACACGTATC AAGACATGCT 2640 

tttgtaggga gtggaacggg tcggtatcac tttaacgact atactgagc? ctSt^cS 2100 

GGAAGTATAG AATGCCGCCC CCATGCTAGG AATTATAATA TAAACTGTGG SgSStS 2760 

COTTTTTAGA AGGTTTCCAT TGCCTGTGTG GTTCCGGATC TTAACTATAA ATCCTGGACT 2820 

atggatcata ggcattgggt ttctcgaact tgtgtggaga ataacgacat tttatatScI 288? 

TAACGGAATA CTCGTATCAC CTCAGCCCCT AGAGACATTC TTTAGGGGTT SSSSS 29I0 

SSSES JESS^ AGAATCCTTO ACGTTCTTGG TTTGCTTGTC SSSSS 00 

TCTCTAACGA ATCATAGGGA TTCCAGGGTT CTGTTCCTTG AGTCCTTTGG CA 3052

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 922 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



Met Arg Phe Ser Leu Cys Gly Phe Pro Leu Val Phe Ser Leu Thr Leu
1 5 10 15

Leu Ser Val Phe Asp Thr Ser Leu Ser Ala Thr Thr Ile Ser Leu Thr
20 25 30

Pro Glu Asp Ser Phe His Gly Asp Ser Gln Asn Ala Glu Arg Ser Tyr
35 40 45

Asn Val Gln Ala Gly Asp Val Tyr Ser Leu Thr Gly Asp Val Ser Ile
50 55 60












A en 


V CtX 


A crs 


Asn 


Q O Y* 

OCX 


Al a 


Leu 


Asn 


Lys 


A 1 

Aia 


Cys 


Phe 


Asn 


vai 


inr 












70 










75 










Q Pi 

O u 


Car 
OCX 




OCX 


Ua 1 


1 ill 


ir ilc 


Ala 
H.1 a. 


uiy 


Asn 


HIS 


His 


Gly 


Leu 


Tyr 


Pne 


Asn 










O D 










Q Pi 














Asn 


Tip 

lie 


C o v 

OCX 


Q 

OCX 


P ~\ ir 


± nr 


i nr 


Lys 


Pin 

blU 


P 1 i r 

Oiy 


Ala 


Val 


Leu 


Cys 


Cys 


Gin 


















i~rvc 
1 US 










110 






Asp 


Pro 


L3 111 


AT =3 

/-ix a 


i nr 


f\± 3. 


Arg 


pne 


ber 


Cjly 


Phe 


Ser 


Thr 


Leu 


Ser 


Phe 






1 1 c 

X X -J 










ion 










125 








Tip 

lie 


pi ti 


O CX 




uiy 


Asp 


Tl Q 

lie 


Lys 


La 1U 


p 1 -r-s 

Cain 


Gly 


Cys 


Leu 


Tyr 


Ser 


Lys 




-L J \J 










1 J) O 










140 












Al ^ 
riX cl 


Leu 




Leu 


Leu 


Asn 


Asn 


Tyr 


vai 


Val 


Arg 


Phe 


Glu 


Gin 


Asn 


145 










ljU 










155 










i a n 
lb U 




Oar 


Lys 


X ilx 


Lys 


uiy 


nl 
vaiy 


Rio 
a.i a 


Ti- 
ne 


Ser 


Gly Ala 


Asn 


Val 


Thr 


lie 




















1 /U 










175 




V ClX 


uiy 


Asn 


Tyr 


Asp 


Ser 




C tim- 
ber 


O Vi a 

rne 


Tyr 


Gin 


Asn 


Ala 


Ala 


Thr 


Pne 








X o u 










1 G C 

loo 










190 






pi \r 


\jj _L y 


ni CL 


lie 


nlS 


Ser 


Ser 


pi,, 
oiy 


Pro 


Leu 


Gin 


He 


Ala 


Val 


Asn 


Gin 






195 










*> pi n 










205 










ul U 


1 lc 


Arg 


DVi a. 
rflc 


TV! 3 

Aia 


uin 


Asn 


1 nr 


7\ 1 ^1 

Ala 


Lys 


Asn 


Gly Ser Gly 


Gly 




6iU 










Z 1 D 










220 










Ala 


Leu 


lyr 


Q <n -y- 


Asp 




Asp 


lie 


Asp 


Ti- 
ne 


Asp 


Gin 


Asn 


Ala 


Tyr 


vai 


225 










9 ^ n 

^ O \J 










235 










1 a n 
U 


Leu 


Phe 




Glu 


Asn 


P 1 n 


Ala 

M.1 a. 


Leu 


1 nr 


inr 


Ala 


He 


Gly 


Lys 


Gly 


(jiy 










^ *± j 










O EC Pl 










255 




Ala 


Val 


Cys 




Leu 


Pro 


i nr 


Cor 

oer 


pi , r 
\j±y 


ber 


Ser 


Thr 


Pro 


Val 


Pro 


T 1 a 

lie 








2 60 










Z O D 










270 






Val 


Thr 


Phe 


Ser 




Asn 


Lys 


P 1 -r-i 

uin 


Leu 


y 3 i 

vai 


Phe 


Glu 


Arg 


Asn 


His 


Ser 






275 










■5 ft pi 










285 








lie 


Met 


Gly 


vj x y 


ox y 


Ala 
.Ml ct 


Tip 

lie 


Tyr 


a l = 
Aid 


Arg 


Lys 


Leu 


Ser 


He 


Ser 


ber 




290 




















300 










Glv 


Gly 


Pro 


Thr 


iJCU 


rile 


Tip 

lie 


Asn 


Asn 


Tl ft 

lie 


Ser 


Tyr 


Ala 


Asn 


Ser 


PI Tl 

bin 


305 










310 










315 










*5 *> Pi 


Asn 


Leu 


Glv 
\jx y 


Gly 


Ala 


Tip 
IXC 


Al a 


X X c 


Asp 


1 nr 


Gly 


Gly 


Glu 


He 


Ser 


Leu 










32 5 










PL 










335 




Ser Ala Glu Lys Gly Thr Ile Thr Phe Gln Gly Asn Arg Thr Ser Leu
340 345 350

Pro Phe Leu Asn Gly Ile His Leu Leu Gln Asn Ala Lys Phe Leu Lys
355 360 365

Leu Gln Ala Arg Asn Gly Cys Ser Ile Glu Phe Tyr Asp Pro Ile Thr
370 375 380

Ser Glu Ala Asp Gly Ser Thr Gln Leu Asn Ile Asn Gly Asn Pro Lys
385 390 395 400

Asn Lys Glu Tyr Thr Gly Thr Ile Leu Phe Ser Gly Glu Lys Ser Leu
405 410 415

Ala Asn Asp Pro Arg Asp Phe Lys Ser Thr Ile Pro Gln Asn Val Asn
420 425 430

Leu Ser Ala Gly Tyr Leu Val Ile Lys Glu Gly Ala Glu Val Thr Val
435 440 445

Ser Lys Phe Thr Gln Ser Pro Gly Ser His Leu Val Leu Asp Leu Gly
450 455 460

Thr Lys Leu Ile Ala Ser Lys Glu Asp Ile Ala Ile Thr Gly Leu Ala
465 470 475 480

Ile Asp Ile Asp Ser Leu Ser Ser Ser Ser Thr Ala Ala Val Ile Lys
485 490 495

Ala Asn Thr Ala Asn Lys Gln Ile Ser Val Thr Asp Ser Ile Glu Leu
500 505 510

Ile Ser Pro Thr Gly Asn Ala Tyr Glu Asp Leu Arg Met Arg Asn Ser
515 520 525

Gln Thr Phe Pro Leu Leu Ser Leu Glu Pro Gly Ala Gly Gly Ser Val
530 535 540

Thr Val Thr Ala Gly Asp Phe Leu Pro Val Ser Pro His Tyr Gly Phe
545 550 555 560

Gln Gly Asn Trp Lys Leu Ala Trp Thr Gly Thr Gly Asn Lys Val Gly
565 570 575

Glu Phe Phe Trp Asp Lys Ile Asn Tyr Lys Pro Arg Pro Glu Lys Glu
580 585 590

Gly Asn Leu Val Pro Asn Ile Leu Trp Gly Asn Ala Val Asn Val Arg
595 600 605

Ser Leu Met Gln Val Gln Glu Thr His Ala Ser Ser Leu Gln Thr Asp
610 615 620

Arg Gly Leu Trp Ile Asp Gly Ile Gly Asn Phe Phe His Val Ser Ala
625 630 635 640

Ser Glu Asp Asn Ile Arg Tyr Arg His Asn Ser Gly Gly Tyr Val Leu
645 650 655

Ser Val Asn Asn Glu Ile Thr Pro Lys His Tyr Thr Ser Met Ala Phe
660 665 670

Ser Gln Leu Phe Ser Arg Asp Lys Asp Tyr Ala Val Ser Asn Asn Glu
675 680 685

Tyr Arg Met Tyr Leu Gly Ser Tyr Leu Tyr Gln Tyr Thr Thr Ser Leu
690 695 700

Gly Asn Ile Phe Arg Tyr Ala Ser Arg Asn Pro Asn Val Asn Val Gly
705 710 715 720

Ile Leu Ser Arg Arg Phe Leu Gln Asn Pro Leu Met Ile Phe His Phe
725 730 735

Leu Cys Ala Tyr Gly His Ala Thr Asn Asp Met Lys Thr Asp Tyr Ala
740 745 750

Asn Phe Pro Met Val Lys Asn Ser Trp Arg Asn Asn Cys Trp Ala Ile
755 760 765

Glu Cys Gly Gly Ser Met Pro Leu Leu Val Phe Glu Asn Gly Arg Leu
770 775 780

Phe Gln Gly Ala Ile Pro Phe Met Lys Leu Gln Leu Val Tyr Ala Tyr
785 790 795 800

Gln Gly Asp Phe Lys Glu Thr Thr Ala Asp Gly Arg Arg Phe Ser Asn
805 810 815

Gly Ser Leu Thr Ser Ile Ser Val Pro Leu Gly Ile Arg Phe Glu Lys
820 825 830

Leu Ala Leu Ser Gln Asp Val Leu Tyr Asp Phe Ser Phe Ser Tyr Ile
835 840 845

Pro Asp Ile Phe Arg Lys Asp Pro Ser Cys Glu Ala Ala Leu Val Ile
850 855 860

Ser Gly Asp Ser Trp Leu Val Pro Ala Ala His Val Ser Arg His Ala
865 870 875 880

Phe Val Gly Ser Gly Thr Gly Arg Tyr His Phe Asn Asp Tyr Thr Glu
885 890 895

Leu Leu Cys Arg Gly Ser Ile Glu Cys Arg Pro His Ala Arg Asn Tyr
900 905 910

Asn Ile Asn Cys Gly Ser Lys Phe Arg
915 920
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2526 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGAAGATTC CACTCCGCTT TTTATTGATA TCATTAGTAC CTACGCTTTC TATGTCGAAT 60

TTATTAGGAG CTGCTACTAC CGAAGAGCTA TCGGCTAGCA ATAGCTTCGA TGGAACTACA 120

TCAACAACAA GCTTTTCTAG TAAAACATCA TCGGCTACAG ATGGCACCAA TTATGTTTTT 180

AAAGATTCTG TAGTTATAGA AAATGTACCC AAAACAGGGG AAACTCAGTC TACTAGTTGT 240

TTTAAAAATG ACGCTGCAGC TGGAGATCTA AATTTCTTAG GAGGGGGATT TTCTTTCACA 300

TTTAGCAATA TCGATGCAAC CACGGCTTCT GGAGCTGCTA TTGGAAGTGA AGCAGCTAAT 360

AAGACAGTCA CGTTATCAGG ATTTTCGGCA CTTTCTTTTC TTAAATCCCC AGCAAGTACA 420

GTGACTAATG GATTGGGAGC TATCAATGTT AAAGGGAATT TAAGCCTATT GGATAATGAT 480

AAGGTATTGA TTCAGGACAA TTTCTCAACA GGAGATGGCG GAGCAATTAA TTGTGCAGGC 540

TCCTTGAAGA TCGCAAACAA TAAGTCCCTT TCTTTTATTG GAAATAGTTC TTCAACACGT 600

GGCGGAGCGA TTCATACCAA AAACCTCACA CTATCTTCTG GTGGGGAAAC TCTATTTCAG 660

GGGAATACAG CGCCTACGGC TGCTGGTAAA GGAGGTGCTA TCGCGATTGC AGACTCTGGC 720

ACCCTATCCA TTTCTGGAGA CAGTGGCGAC ATTATCTTTG AAGGCAATAC GATAGGAGCT 780

ACAGGAACCG TCTCTCATAG TGCTATTGAT TTAGGAACTA GCGCTAAGAT AACTGCGTTA 840

CGTGCTGCGC AAGGACATAC GATATACTTT TATGATCCGA TTACTGTAAC AGGATCGACA 900

TCTGTTGCTG ATGCTCTCAA TATTAATAGC CCTGATACTG GAGATAACAA AGAGTATACG 960

GGAACCATAG TCTTTTCTGG AGAGAAGCTC ACGGAGGCAG AAGCTAAAGA TGAGAAGAAC 1020

CGCACTTCTA AATTACTTCA AAATGTTGCT TTTAAAAATG GGACTGTAGT TTTAAAAGGT 1080

GATGTCGTTT TAAGTGCGAA CGGTTTCTCT CAGGATGCAA ACTCTAAGTT GATTATGGAT 1140

TTAGGGACGT CGTTGGTTGC AAACACCGAA AGTATCGAGT TAACGAATTT GGAAATTAAT 1200

ATAGACTCTC TCAGGAACGG GAAAAAGATA AAACTCAGTG CTGCCACAGC TCAGAAAGAT 1260

ATTCGTATAG ATCGTCCTGT TGTACTGGCA ATTAGCGATG AGAGTTTTTA TCAAAATGGC 1320

TTTTTGAATG AGGACCATTC CTATGATGGG ATTCTTGAGT TAGATGCTGG GAAAGACATC 1380

GTGATTTCTG CAGATTCTCG CAGTATAAAT GCTGTACAAT CTCCGTATGG CTATCAGGGA 1440

AAGTGGACAA TCAATTGGTC TACTGATGAT AAGAAAGCTA CGGTTTCTTG GGCAAAGCAA 1500

AGTTTTAATC CCACTGCTGA GCAGGAGGCT CCGTTAGTTC CTAATCTTCT TTGGGGTTCT 1560

TTTATAGATG TTCGTCCCTT CCAAAATTTT ATAGAGCTAG GTACTGAAGG TGCTCCTTAC 1620

GAAAAGAGAT TTTGGGTTGC AGGCATTTCC AATGTTTTGC ATAGGAGCGG TCGTGAAAAT 1680

CAAAGGAAAT TCCGTCATGT GAGTGGAGGT GCTGTAGTAG GTGCTAGCAC GAGGATGCCG 1740

GGTGGTGATA CCTTGTCTCT GGGTTTTGCT CAGCTCTTTG CGCGTGACAA AGACTACTTT 1800

ATGAATACCA ATTTCGCAAA GACCTACGCA GGATCTTTAC GTTTGCAGCA CGATGCTTCC 1860

CTATACTCTG TGGTGAGTAT CCTTTTAGGA GAGGGAGGAC TCCGCGAGAT CCTGTTGCCT 1920

TATGTTTCCA AGACTCTGCC GTGCTCTTTC TATGGGCAGC TTAGCTACGG CCATACGGAT 1980

CATCGCATGA AGACCGAGTC TCTACCCCCC CCCCCCCCGA CGCTCTCGAC GGATCATACT 2040

TCTTGGGGAG GATATGTCTG GGCTGGAGAG CTGGGAACTC GAGTTGCTGT TGAAAATACC 2100

AGCGGCAGAG GATTTTTCCG AGAGTACACT CCATTTGTAA AAGTCCAAGC TGTTTACTCG 2160

CGCCAAGATA GCTTTGTTGA ACTAGGAGCT ATCAGTCGTG ATTTTAGTGA TTCGCATCTT 2220

TATAACCTTG CGATTCCTCT TGGAATCAAG TTAGAGAAAC GGTTTGCAGA GCAATATTAT 2280

CATGTTGTAG CGATGTATTC TCCAGATGTT TGTCGTAGTA ACCCCAAATG TACGACTACC 2340

CTACTTTCCA ACCAAGGGAG TTGGAAGACC AAAGGTTCGA ACTTAGCAAG ACAGGCTGGT 2400

ATTGTTCAGG CCTCAGGTTT TCGATCTTTG GGAGCTGCAG CAGAGCTTTT CGGGAACTTT 2460

GGCTTTGAAT GGCGGGGATC TTCTCGTAGC TATAATGTAG ATGCGGGTAG CAAAATCAAA 2520

TTTTAG 2526

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 841 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



Met Lys Ile Pro Leu Arg Phe Leu Leu Ile Ser Leu Val Pro Thr Leu
1 5 10 15

Ser Met Ser Asn Leu Leu Gly Ala Ala Thr Thr Glu Glu Leu Ser Ala
20 25 30

Ser Asn Ser Phe Asp Gly Thr Thr Ser Thr Thr Ser Phe Ser Ser Lys
35 40 45

Thr Ser Ser Ala Thr Asp Gly Thr Asn Tyr Val Phe Lys Asp Ser Val
50 55 60

Val Ile Glu Asn Val Pro Lys Thr Gly Glu Thr Gln Ser Thr Ser Cys
65 70 75 80

Phe Lys Asn Asp Ala Ala Ala Gly Asp Leu Asn Phe Leu Gly Gly Gly
85 90 95

Phe Ser Phe Thr Phe Ser Asn Ile Asp Ala Thr Thr Ala Ser Gly Ala
100 105 110






Ala Ile Gly Ser Glu Ala Ala Asn Lys Thr Val Thr Leu Ser Gly Phe
115 120 125

Ser Ala Leu Ser Phe Leu Lys Ser Pro Ala Ser Thr Val Thr Asn Gly
130 135 140

Leu Gly Ala Ile Asn Val Lys Gly Asn Leu Ser Leu Leu Asp Asn Asp
145 150 155 160

Lys Val Leu Ile Gln Asp Asn Phe Ser Thr Gly Asp Gly Gly Ala Ile
165 170 175

Asn Cys Ala Gly Ser Leu Lys Ile Ala Asn Asn Lys Ser Leu Ser Phe
180 185 190

Ile Gly Asn Ser Ser Ser Thr Arg Gly Gly Ala Ile His Thr Lys Asn
195 200 205

Leu Thr Leu Ser Ser Gly Gly Glu Thr Leu Phe Gln Gly Asn Thr Ala
210 215 220

Pro Thr Ala Ala Gly Lys Gly Gly Ala Ile Ala Ile Ala Asp Ser Gly
225 230 235 240

Thr Leu Ser Ile Ser Gly Asp Ser Gly Asp Ile Ile Phe Glu Gly Asn
245 250 255

Thr Ile Gly Ala Thr Gly Thr Val Ser His Ser Ala Ile Asp Leu Gly
260 265 270

Thr Ser Ala Lys Ile Thr Ala Leu Arg Ala Ala Gln Gly His Thr Ile
275 280 285

Tyr Phe Tyr Asp Pro Ile Thr Val Thr Gly Ser Thr Ser Val Ala Asp
290 295 300

Ala Leu Asn Ile Asn Ser Pro Asp Thr Gly Asp Asn Lys Glu Tyr Thr
305 310 315 320


Glv 
oxy 


Thr 


lie 


Val 


Phe 


Ser 


Gly 


Glu 


Lys 


Leu 


Thr 


Glu 


Ala 


OX U. 


Ala 


Lys 










32 5 










330 










3 3 3 




Asp 


Glu 


Lys 


Asn 


Arcr 
iAX y 


Thr 


Ser 


Lys 


Leu 


Leu 


Gin 


nail 


V ctx 


AT a 
rtX d 


rue 


Lys 








340 










345 










350 






Zi on 
noli 


G 1 v 
ox y 


Thr 


Val 


Val 


Leu 


j-i y o 


UX y 


A er^ 
flop 


vctx 


v d x 


Leu 


Cor- 
OCX 


AT a 

ax a 


Asn 


pi 
uiy 






3 55 










36 0 










fi S 
J O 3 








tr lie 


i?CX 


Gin 




Ala 


A en 


OCX 


Xjy a 


T . i 
XiCLl 


Tip 
X X c 


l w ic u 


Asp 


Leu 


p "I , r 

biy 


l nr 


Ser 




370 










375 










3 80 










Leu 


Val 


Ala 


Asn 


Thr 


Glu 


OCX 


Tip 

x x c 


VjXU 


Leu 


X IlX 


Asn 


Leu 


ulu 


Tip 

lie 


Asn 


385 










3 90 










395 










4fl n 

*± \J Lf 


Tip 

lie 


Asp 


O C X. 


Lieu 


Arg 


Asn 


*jxy 


Lys 


Lys 


He 


Lys 


Leu 


her 


Aia 


A 1 -i 

Aia 


mr 










405 










410 










41 c 

*± X 3 




Ala 


VJXll 


j-iy s 


Asp 


lie 


Arg 


Tip 
X X C 


Asp 


Arg 


Pro 


vax 


vax 


Leu 


Aia 


Tl - 

lie 


Ser 








420 










425 










4 7 n 






Asp 


pi 1 1 

ul LI 


o cx 


■r lie 


Tyr 


a 1 n 
o XII 


Asn 


uiy 


Phe 


Leu 


Asn 


bill 


Asp 


T_T ^ _ 

HI S 


Ser 


Tyr 






4 R 
•±33 










4 a n 

4 4 u 










A A C 








ASp 


Giy 


lie 


Leu 


bXu 


_ _ 

Leu 


— 
Asp 


ax a 


Gly Lys - 


"Asp - 


lie 


vai 


l le 


ber 


7>--,-_.-- 

Aia 




4 50 










4 R R 










A C fi 
4 O U 










Asp 


S e x*T 


Arg 


Ser 


Tl Q 

x x e 


Asn 


A 1 -a 

ax a 


T 7 — i 1 

val 


Gin 


Ser 


Pro 


Tyr 


P 1 i r 

Gly 


Tyr 


Gin 


Gly 


465 










4 70 










A 1 K 

4 i 3 










a Q n 
4 o U 


Lys 


Trp 


X XI XV 


Tin 

i xe 


Asn 


Trp 


Ser 


mr 


Asp 


Asp 


Lys 


Lys 


Ala 


Thr 


Val 


Ser 










4 R R 

*x O 3 










490 










4 3 3 




xrp 


AT a 

riX d 


Lys 


p l « 


C /-^ -w 

OCX 


rne 


Asn 


Pro 


Thr 


Ala 


(jIU 


Gin 


CjIU 


Ala 


Pro 


Leu 


















505 










3lU 






VO.J. 


fx O 


Asn 


Leu 


Leu 


Trp 


P 1 ■» r 

VaXy 


Ser 


Phe 


He 


Asp 


val 


Arg 


Pro 


Phe 


Gin 






515 










3^ u 










3Z D 








7\ on 


rile 


ixe 


Glu 


Leu 


uiy 


Tnr 


LjXU 


Gly Ala 


Pro 


Tyr 


CjIU 


Lys 


Arg 


Pne 




53 0 










C"l c 










3 4 U 










Trp 


Val 


Ala 
nX a. 


Gly 


lie 


Ser 


Asn 


vax 


Leu 


His 


Arg 


Ser 


P 1 v r 

Gly 


Arg 


p i ^ » 

GlU 


Asn 


545 




















EC C 

33 3 










3D U 


m t-i 

ulu 


Arg 


Lys 


Phe 


Arg 


nls 


vax 


Ser 


Gly Gly 


Aia 


val 


vai 


Gly 


A 1 -i 

Ala 


Ser 










565 










570 










R *7 
3/3 




THt* 

X I IX. 


Arg 


i v tc c 


Pro Gly 


biy 


Asp 


l nr 


Leu 


Ser 


Leu 


uiy 


fne 


Ala 


P 1 t-i 

Gin 


Leu 








580 










585 










con 








Ct 


Arg 


Asp 


Lys 


Asp 


Tyr 


rne 


Met 


Asn 


inr 


Asn 




A 1 -i 

Aia 


Lys 


Thr 






5 95 










O U U 










O U 3 










Al a 
M.X d 


uiy 


Ser 


Leu 


Arg 


Leu 


Ply, 


His 


Asp 


Al a 

Aia 


Ser 


Leu 


Tyr 


Ser 


vai 




610 










OX J 










o/u 










V CL-L 


Cor 
OCX 


Tip 

X X c 


Leu 


Leu 


p 1 

uxy 


LjX 11 


PI 

tjxy 


Gly 


Leu 


Arg 


ulu 


lie 


Leu 


Leu 


Pro 


625 










63 0 










(TIC 
D J 3 










D4 U 


xyr 


vax 


OCX 


Lys 


Thr 


Leu 


Pro 


Cys 


Ser 


Phe 


Tyr 


uiy 


P 1 r-i 


Leu 


Ser 


Tyr 










645 










650 










a c; zl 

O 3 3 




Gly 


Hi <3 


X XIX 


Asp 


His 


Arg 


Mot- 
Met 


Lys 


Thr 


Glu 


Ser 


Leu 


Pro 


Pro 


Pro 


Fro 








660 










665 










D / U 






Pro 


Thr 


Leu 


Ser 


Thr 


A en 
nop 


nx s 


TV. y- 

X XIX 


Ser 


Trp 


p 1 


uiy 


Tyr 


val 


Trp 


A 1 o 

Aia 






675 










680 










O O 3 








Glv 


Glu 


Leu 


Gly Thr 


A yrr 

Arg 


VAX 


Al a 
AX a. 


Val 


Glu 


Asn 


i nr 


Ser 


uiy 


Arg 


pi, 7 
uiy 




690 










w 7 3 










*7 n n 












trllC 


A rrr 

Arg 


Glu 


Tyr 


1 ill 


Pro 


rlic 


Val 


Lys 


val 


Gin 


Ala 


val 


Tyr 


ser 


705 










710 










715 










720 


Arg 


Gin 


Asp 


Ser 


Phe 


Val 


Glu 


Leu 


Gly Ala 


He 


Ser 


Arg 


Asp 


Phe 


Ser 










725 










730 










735 




Asp 


Ser 


His 


Leu 


Tyr 


Asn 


Leu 


Ala 


He 


Pro 


Leu 


Gly 


He 


Lys 


Leu 


Glu 








740 










745 










750 






Lys 


Arg 


Phe 


Ala 


Glu 


Gin 


Tyr 


Tyr 


His 


Val 


Val 


Ala 


Met 


Tyr 


Ser 


Pro 






755 










760 










765 
Asp


va± 


Cys 


Arg 


Ser 


Asn 


Pro 


Lys 




770 










775 




rii ,r 
ul y 


C 1 fty 


Trp 


L»ys 


Thr 


Lys 


Gly 


785 










790 




lie 


Val 


Gin 


Ala 


Ser 
805 


Gly 


Phe 


Arg 


Phe 


Gly 


Asn 


Phe 
820 


Gly 


Phe 


Glu 


Trp 


Val 


Asp 


Ala 
835 


Gly 


Ser 


Lys 


He 


Lys 
840 




Cys Thr Thr Thr Leu Leu Ser Asn 
780 

Ser Asn Leu Ala Arg Gln Ala Gly
795 800 
Ser Leu Gly Ala Ala Ala Glu Leu 

810 815 
Arg Gly Ser Ser Arg Ser Tyr Asn 
825 830 
Phe 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



TTTGT CTATG 




AAATGGTAAT 


120 


GG GAAATGTC 




TAACAACACT 


240 


GGTGGATGCA 


300 


CACGTTTATA 


360 


CGGCAAAGGA 


420 


GCTCTTCAGC 


480 


ATTAA CAGGG 


540 


AG CCATTCAG 


600 


TGACAATACT 


660 


TAATAATGCT 


720 


GGGGGATATG 


780 


CCTCACTGGA 


840 


TATCTATGTG 


900 


TGTCAATGGA 


960 


GAGTTTATCC 


1020 


TCCTGGGACG 


1080 


TTCTGCTGCT 


1140 


AGTTACAGAT 


1200 


GAACATCATC 


1260 


TACTTCGAAG 


1320 


AGTGACTCTG 


1380 


AGGAACTACT 


1440 


TTCTATAGAC 


1500 


TTTAT CTGGA 


1560 


AAGAAATCCT 


1620 


CGCAGTGACT 


1680 


GGGCCCAATT 


1740 


TGGCTATATT 


1800 


ATTTATAGAT 


1860 


AG ACCGTG CT 


1920 


ACGACGCGGG 


1980 


TTCAGATAAG 


2040 
ATTCTTAGTG CTGCATTTTG TCAGCTCTTT GGAAGAGATA GAGACTACTT TGTAGCTAAG 2100

AATCAAGGTA CAGTCTACGG AGGAACTCTC TATTACCAGC ACAACGAAAC CTATATCTCT 2160

CTTCCTTGCA AACTACGGCC TTGTTCGTTG TCTTATGTTC CTACAGAGAT TCCTGTTCTC 2220

TTTTCAGGAA ACCTTAGCTA CACCCATACG GATAACGATC TGAAAACCAA GTATACAACA 2280

TATCCTACTG TTAAAGGAAG CTGGGGGAAT GATAGTTTCG CTTTAGAATT CGGTGGAAGA 2340

GCTCCGATTT GCTTAGATGA AAGTGCTCTA TTTGAGCAGT ACATGCCCTT CATGAAATTG 2400

CAGTTTGTCT ATGCACATCA GGAAGGTTTT AAAGAACAGG GAACAGAAGC TCGTGAATTT 2460

GGAAGTAGCC GTCTTGTGAA TCTTGCCTTA CCTATCGGGA TCCGATTTGA TAAGGAATCA 2520

GACTGCCAAG ATGCAACGTA CAATCTAACT CTTGGTTATA CTGTGGATCT TGTTCGTAGT 2580

AACCCCGACT GTACGACAAC ACTGCGAATT AGCGGTGATT CTTGGAAAAC CTTCGGTACG 2640

AATTTGGCAA GACAAGCTTT AGTCCTTCGT GCAGGGAACC ATTTTTGCTT TAACTCAAAT 2700

TTTGAAGCCT TTAGCCAATT TTCTTTTGAA TTGCGTGGGT CATCTCGCAA TTACAATGTA 2760

GACTTAGGAG CAAAATACCA ATTCTAA 2787

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 928 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



Met Lys Ser Ser Phe Pro Lys Phe Val Phe Ser Thr Phe Ala Ile Phe
1 5 10 15

Pro Leu Ser Met Ile Ala Thr Glu Thr Val Leu Asp Ser Ser Ala Ser
20 25 30

Phe Asp Gly Asn Lys Asn Gly Asn Phe Ser Val Arg Glu Ser Gln Glu
35 40 45

Asp Ala Gly Thr Thr Tyr Leu Phe Lys Gly Asn Val Thr Leu Glu Asn
50 55 60

Ile Pro Gly Thr Gly Thr Ala Ile Thr Lys Ser Cys Phe Asn Asn Thr
65 70 75 80

Lys Gly Asp Leu Thr Phe Thr Gly Asn Gly Asn Ser Leu Leu Phe Gln
85 90 95

Thr Val Asp Ala Gly Thr Val Ala Gly Ala Ala Val Asn Ser Ser Val
100 105 110

Val Asp Lys Ser Thr Thr Phe Ile Gly Phe Ser Ser Leu Ser Phe Ile
115 120 125

Ala Ser Pro Gly Ser Ser Ile Thr Thr Gly Lys Gly Ala Val Ser Cys
130 135 140

Ser Thr Gly Ser Leu Lys Phe Asp Lys Asn Val Ser Leu Leu Phe Ser
145 150 155 160

Lys Asn Phe Ser Thr Asp Asn Gly Gly Ala Ile Thr Ala Lys Thr Leu
165 170 175

Ser Leu Thr Gly Thr Thr Met Ser Ala Leu Phe Ser Glu Asn Thr Ser
180 185 190

Ser Lys Lys Gly Gly Ala Ile Gln Thr Ser Asp Ala Leu Thr Ile Thr
195 200 205

Gly Asn Gln Gly Glu Val Ser Phe Ser Asp Asn Thr Ser Ser Asp Ser
210 215 220

Gly Ala Ala Ile Phe Thr Glu Ala Ser Val Thr Ile Ser Asn Asn Ala
225 230 235 240


Lys Val Ser Phe Ile Asp Asn Lys Val Thr Gly Ala Ser Ser Ser Thr
245 250 255

Thr Gly Asp Met Ser Gly Gly Ala Ile Cys Ala Tyr Lys Thr Ser Thr
260 265 270

Asp Thr Lys Val Thr Leu Thr Gly Asn Gln Met Leu Leu Phe Ser Asn
275 280 285

Asn Thr Ser Thr Thr Ala Gly Gly Ala Ile Tyr Val Lys Lys Leu Glu
290 295 300

Leu Ala Ser Gly Gly Leu Thr Leu Phe Ser Arg Asn Ser Val Asn Gly
305 310 315 320

Gly Thr Ala Pro Lys Gly Gly Ala Ile Ala Ile Glu Asp Ser Gly Glu
325 330 335

Leu Ser Leu Ser Ala Asp Ser Gly Asp Ile Val Phe Leu Gly Asn Thr
340 345 350

Val Thr Ser Thr Thr Pro Gly Thr Asn Arg Ser Ser Ile Asp Leu Gly
355 360 365

Thr Ser Ala Lys Met Thr Ala Leu Arg Ser Ala Ala Gly Arg Ala Ile
370 375 380

Tyr Phe Tyr Asp Pro Ile Thr Thr Gly Ser Ser Thr Thr Val Thr Asp
385 390 395 400

Val Leu Lys Val Asn Glu Thr Pro Ala Asp Ser Ala Leu Gln Tyr Thr
405 410 415

Gly Asn Ile Ile Phe Thr Gly Glu Lys Leu Ser Glu Thr Glu Ala Ala
420 425 430

Asp Ser Lys Asn Leu Thr Ser Lys Leu Leu Gln Pro Val Thr Leu Ser
435 440 445

Gly Gly Thr Leu Ser Leu Lys His Gly Val Thr Leu Gln Thr Gln Ala
450 455 460

Phe Thr Gln Gln Ala Asp Ser Arg Leu Glu Met Asp Val Gly Thr Thr
465 470 475 480

Leu Glu Pro Ala Asp Thr Ser Thr Ile Asn Asn Leu Val Ile Asn Ile
485 490 495

Ser Ser Ile Asp Gly Ala Lys Lys Ala Lys Ile Glu Thr Lys Ala Thr
500 505 510

Ser Lys Asn Leu Thr Leu Ser Gly Thr Ile Thr Leu Leu Asp Pro Thr
515 520 525

Gly Thr Phe Tyr Glu Asn His Ser Leu Arg Asn Pro Gln Ser Tyr Asp
530 535 540

Ile Leu Glu Leu Lys Ala Ser Gly Thr Val Thr Ser Thr Ala Val Thr
545 550 555 560

Pro Asp Pro Ile Met Gly Glu Lys Phe His Tyr Gly Tyr Gln Gly Thr
565 570 575

Trp Gly Pro Ile Val Trp Gly Thr Gly Ala Ser Thr Thr Ala Thr Phe
580 585 590

Asn Trp Thr Lys Thr Gly Tyr Ile Pro Asn Pro Glu Arg Ile Gly Ser
595 600 605

Leu Val Pro Asn Ser Leu Trp Asn Ala Phe Ile Asp Ile Ser Ser Leu
610 615 620

His Tyr Leu Met Glu Thr Ala Asn Glu Gly Leu Gln Gly Asp Arg Ala
625 630 635 640

Phe Trp Cys Ala Gly Leu Ser Asn Phe Phe His Lys Asp Ser Thr Lys
645 650 655

Thr Arg Arg Gly Phe Arg His Leu Ser Gly Gly Tyr Val Ile Gly Gly
660 665 670

Asn Leu His Thr Cys Ser Asp Lys Ile Leu Ser Ala Ala Phe Cys Gln
675 680 685

Leu Phe Gly Arg Asp Arg Asp Tyr Phe Val Ala Lys Asn Gln Gly Thr
690 695 700
Val Tyr Gly Gly Thr Leu Tyr Tyr Gln His Asn Glu Thr Tyr Ile Ser
705 710 715 720

Leu Pro Cys Lys Leu Arg Pro Cys Ser Leu Ser Tyr Val Pro Thr Glu
725 730 735

Ile Pro Val Leu Phe Ser Gly Asn Leu Ser Tyr Thr His Thr Asp Asn
740 745 750

Asp Leu Lys Thr Lys Tyr Thr Thr Tyr Pro Thr Val Lys Gly Ser Trp
755 760 765

Gly Asn Asp Ser Phe Ala Leu Glu Phe Gly Gly Arg Ala Pro Ile Cys
770 775 780

Leu Asp Glu Ser Ala Leu Phe Glu Gln Tyr Met Pro Phe Met Lys Leu
785 790 795 800

Gln Phe Val Tyr Ala His Gln Glu Gly Phe Lys Glu Gln Gly Thr Glu
805 810 815

Ala Arg Glu Phe Gly Ser Ser Arg Leu Val Asn Leu Ala Leu Pro Ile
820 825 830

Gly Ile Arg Phe Asp Lys Glu Ser Asp Cys Gln Asp Ala Thr Tyr Asn
835 840 845

Leu Thr Leu Gly Tyr Thr Val Asp Leu Val Arg Ser Asn Pro Asp Cys
850 855 860

Thr Thr Thr Leu Arg Ile Ser Gly Asp Ser Trp Lys Thr Phe Gly Thr
865 870 875 880

Asn Leu Ala Arg Gln Ala Leu Val Leu Arg Ala Gly Asn His Phe Cys
885 890 895

Phe Asn Ser Asn Phe Glu Ala Phe Ser Gln Phe Ser Phe Glu Leu Arg
900 905 910

Gly Ser Ser Arg Asn Tyr Asn Val Asp Leu Gly Ala Lys Tyr Gln Phe
915 920 925



(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2757 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATGAGATCGT CTTTTTCCTT GTTATTAATA TCTTCATCTC TAGCCTTTCC TCTCTTAATG 60 

AGTGTTTCTG CAGATGCTGC CGATCTCACA TTAGGGAGTC GTGACAGTTA TAATGGTGAT 120 

ACAAGCACCA CAGAATTTAC TCCTAAAGCG GCAACTTCTG ATGCTAGTGG CACGACCTAT 180 

ATTCTCGATG GGGATGTCTC GATAAGCCAA GCAGGGAAAC AAACGAGCTT AACCACAAGT 240 

TGTTTTTCTA ACACTGCAGG AAATCTTACC TTCTTAGGGA ACGGATTTTC TCTTCATTTT 300 

GACAATATTA TTTCGTCTAC TGTTGCAGGT GTTGTTGTTA GCAATACAGC AGCTTCTGGG 360 

ATTACGAAAT TCTCAGGATT TTCAACTCTT CGGATGCTTG CAGCTCCTAG GACCACAGGT 420 

AAAGGAGCCA TTAAAATTAC CGATGGTCTG GTGTTTGAGA GTATAGGGAA TCTTGACCAA 480 

AATGAAAATG CCTCTAGTGA AAATGGGGGA GCCATCAATA CGAAGACTTT GTCTTTGACT 540 

GGGAGTACGC GGTTTGTAGC GTTCCTTGGC AATAGCTCGT CGCAACAAGG GGGAGCGATC 600 

TATGCTTCTG GTGACTCTGT GATTTCTGAG AATGCAGGAA TCTTGAGCTT CGGAAACAAC 660 

AGTGCGACAA CATCAGGAGG CGCGATCTCT GCTGAAGGGA ACCTTGTGAT CTCCAATAAC 720 

CAAAATATCT TTTTCGATGG CTGCAAAGCA ACTACAAATG GCGGAGCTAT TGATTGTAAC 780 

AAAGCAGGGG CGAACCCAGA CCCTATCTTG ACTCTTTCAG GAAATGAGAG CCTGCATTTT 840 

CTGAATAACA CAGCAGGAAA TAGTGGAGGT GCGATTTATA CCAAAAAATT GGTGTTATCC 900 

TCAGGACGAG GAGGAGTGTT ATTTTCTAAC AACAAAGCTG CGAATGCTAC TCCTAAAGGA 960 

GGGGCAATTG CGATTCTAGA TTCTGGAGAG ATTAGCATTT CTGCAGATCT CGGCAATATC 1020 

ATTTTCGAGG GCAATACTAC GAGCACTACA GGAAGTCCTG CGAGTGTGAC CAGAAATGCT 1080 

ATAGATCTTG CATCGAATGC AAAATTTTTA AATCTCCGAG CGACTCGGGG AAATAAAGTT 1140 

ATTTTCTATG ATCCTATCAC GAGCTCAGGA GCTACTGATA AGCTCTCTTT GAATAAAGCT 1200 

GACGCAGGAT CTGGAAATAC CTATGAAGGC TACATCGTTT TCTCTGGAGA GAAACTCTCA 1260 

GAAGAGGAAC TTAAGAAACC TGACAATCTG AAGTCTACAT TTACACAGGC TGTAGAGCTT 1320 

GCTGCAGGTG CCTTAGTATT GAAAGATGGA GTGACTGTAG TTGCAAATAC TATAACGCAG 1380 

GTCGAGGGAT CGAAAGTCGT TATGGATGGA GGGACTACTT TTGAGGCAAG CGCTGAGGGG 1440 

GTCACTCTCA ATGGCCTAGC CATTAATATA GATTCCTTAG ATGGGACAAA TAAAGCTATC 1500 

ATTAAGGCGA CGGCAGCAAG TAAGGATGTT GCCTTATCAG GGCCTATCAT GCTTGTAGAT 1560 

GCTCAGGGGA ACTATTATGA GCATCATAAT CTCAGTCAAC AGCAGGTCTT TCCTTTAATA 1620 

GAGCTTTCTG CACAAGGAAC GATGACTACT ACAGATATCC CCGATACCCC AATTCTAAAT 1680 

ACTACGAATC ACTATGGGTA TCAAGGAACT GGAATAATTG TTTGGGTCGA CGATGCAACT 1740 

GCAAAAACAA AAAATGCTAC CTTAACTTGG ACTAAAACAG GATACAAGCC GAATCCAGAA 1800 

CGTCAGGGAC CTTTGGTTCC TAATAGCCTG TGGGGTTCTT TTGTCGATGT CCGCTCCATT 1860 

CAGAGCCTCA TGGACCGGAG CACAAGTTCG TTATCTTCGT CAACAAATTT GTGGGTATCA 1920 

GGAATCGCGG ACTTTTTGCA TGAAGATCAG AAAGGAAACC AACGTAGTTA TCGTCATTCT 1980 

AGCGCGGGTT ATGCATTAGG AGGAGGATTC TTCACGGCTT CTGAAAATTT CTTTAATTTT 2040 

GCTTTTTGTC AGCTTTTTGG CTACGACAAG GACCATCTTG TGGCTAAGAA CCATACCCAT 2100 

GTATATGCAG GGGCAATGAG TTACCGACAC CTCGGAGAGT CTAAGACCCT CGCTAAGATT 2160 

TTGTCAGGAA ATTCTGACTC CCTACCTTTT GTCTTCAATG CTCGGTTTGC TTATGGCCAT 2220 

ACCGACAATA ACATGACCAC AAAGTACACT GGCTATTCTC CTGTTAAGGG AAGCTGGGGA 2280 

AATGATGCCT TCGGTATAGA ATGTGGAGGA GCTATCCCGG TAGTTGCTTC AGGACGTCGG 2340 

TCTTGGGTGG ATACCCACAC GCCATTTCTA AACCTAGAGA TGATCTATGC ACATCAGAAT 2400 

GACTTTAAGG AAAACGGCAC AGAAGGCCGT TCTTTCCAAA GTGAAGACCT CTTCAATCTA 2460 

GCGGTTCCTG TAGGGATAAA ATTTGAGAAA TTCTCCGATA AGTCTACGTA TGATCTCTCC 2520 

ATAGCTTACG TTCCCGATGT GATTCGTAAT GATCCAGGCT GCACGACAAC TCTTATGGTT 2580 

TCTGGGGATT CTTGGTCGAC ATGTGGTACA AGCTTGTCTA GACAAGCTCT TCTTGTACGT 2640 

GCTGGAAATC ATCATGCCTT TGCTTCAAAC TTTGAAGTTT TCAGTCAGTT TGAAGTCGAG 2700 

TTGCGAGGTT CTTCTCGTAG CTATGCTATC GATCTTGGAG GAAGATTCGG ATTTTAA 2757 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 918 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



Met Arg Ser Ser Phe Ser Leu Leu Leu Ile Ser Ser Ser Leu Ala Phe 
1 5 10 15 

Pro Leu Leu Met Ser Val Ser Ala Asp Ala Ala Asp Leu Thr Leu Gly 
20 25 30 

Ser Arg Asp Ser Tyr Asn Gly Asp Thr Ser Thr Thr Glu Phe Thr Pro 
35 40 45 

Lys Ala Ala Thr Ser Asp Ala Ser Gly Thr Thr Tyr Ile Leu Asp Gly 
50 55 60 

Asp Val Ser Ile Ser Gln Ala Gly Lys Gln Thr Ser Leu Thr Thr Ser 
65 70 75 80 

Cys Phe Ser Asn Thr Ala Gly Asn Leu Thr Phe Leu Gly Asn Gly Phe 
85 90 95 

Ser Leu His Phe Asp Asn Ile Ile Ser Ser Thr Val Ala Gly Val Val 
100 105 110 

Val Ser Asn Thr Ala Ala Ser Gly Ile Thr Lys Phe Ser Gly Phe Ser 
115 120 125 

Thr Leu Arg Met Leu Ala Ala Pro Arg Thr Thr Gly Lys Gly Ala Ile 
130 135 140 

Lys Ile Thr Asp Gly Leu Val Phe Glu Ser Ile Gly Asn Leu Asp Gln 
145 150 155 160 

Asn Glu Asn Ala Ser Ser Glu Asn Gly Gly Ala Ile Asn Thr Lys Thr 
165 170 175 

Leu Ser Leu Thr Gly Ser Thr Arg Phe Val Ala Phe Leu Gly Asn Ser 
180 185 190 

Ser Ser Gln Gln Gly Gly Ala Ile Tyr Ala Ser Gly Asp Ser Val Ile 
195 200 205 

Ser Glu Asn Ala Gly Ile Leu Ser Phe Gly Asn Asn Ser Ala Thr Thr 
210 215 220 

Ser Gly Gly Ala Ile Ser Ala Glu Gly Asn Leu Val Ile Ser Asn Asn 
225 230 235 240 

Gln Asn Ile Phe Phe Asp Gly Cys Lys Ala Thr Thr Asn Gly Gly Ala 
245 250 255 

Ile Asp Cys Asn Lys Ala Gly Ala Asn Pro Asp Pro Ile Leu Thr Leu 
260 265 270 

Ser Gly Asn Glu Ser Leu His Phe Leu Asn Asn Thr Ala Gly Asn Ser 
275 280 285 

Gly Gly Ala Ile Tyr Thr Lys Lys Leu Val Leu Ser Ser Gly Arg Gly 
290 295 300 

Gly Val Leu Phe Ser Asn Asn Lys Ala Ala Asn Ala Thr Pro Lys Gly 
305 310 315 320 

Gly Ala Ile Ala Ile Leu Asp Ser Gly Glu Ile Ser Ile Ser Ala Asp 
325 330 335 

Leu Gly Asn Ile Ile Phe Glu Gly Asn Thr Thr Ser Thr Thr Gly Ser 
340 345 350 

Pro Ala Ser Val Thr Arg Asn Ala Ile Asp Leu Ala Ser Asn Ala Lys 
355 360 365 

Phe Leu Asn Leu Arg Ala Thr Arg Gly Asn Lys Val Ile Phe Tyr Asp 
370 375 380 

Pro Ile Thr Ser Ser Gly Ala Thr Asp Lys Leu Ser Leu Asn Lys Ala 
385 390 395 400 

Asp Ala Gly Ser Gly Asn Thr Tyr Glu Gly Tyr Ile Val Phe Ser Gly 
405 410 415 

Glu Lys Leu Ser Glu Glu Glu Leu Lys Lys Pro Asp Asn Leu Lys Ser 
420 425 430 

Thr Phe Thr Gln Ala Val Glu Leu Ala Ala Gly Ala Leu Val Leu Lys 
435 440 445 

Asp Gly Val Thr Val Val Ala Asn Thr Ile Thr Gln Val Glu Gly Ser 
450 455 460 

Lys Val Val Met Asp Gly Gly Thr Thr Phe Glu Ala Ser Ala Glu Gly 
465 470 475 480 

Val Thr Leu Asn Gly Leu Ala Ile Asn Ile Asp Ser Leu Asp Gly Thr 
485 490 495 

Asn Lys Ala Ile Ile Lys Ala Thr Ala Ala Ser Lys Asp Val Ala Leu 
500 505 510 

Ser Gly Pro Ile Met Leu Val Asp Ala Gln Gly Asn Tyr Tyr Glu His 
515 520 525 

His Asn Leu Ser Gln Gln Gln Val Phe Pro Leu Ile Glu Leu Ser Ala 
530 535 540 

Gln Gly Thr Met Thr Thr Thr Asp Ile Pro Asp Thr Pro Ile Leu Asn 
545 550 555 560 

Thr Thr Asn His Tyr Gly Tyr Gln Gly Thr Gly Ile Ile Val Trp Val 
565 570 575 

Asp Asp Ala Thr Ala Lys Thr Lys Asn Ala Thr Leu Thr Trp Thr Lys 
580 585 590 

Thr Gly Tyr Lys Pro Asn Pro Glu Arg Gln Gly Pro Leu Val Pro Asn 
595 600 605 

Ser Leu Trp Gly Ser Phe Val Asp Val Arg Ser Ile Gln Ser Leu Met 
610 615 620 

Asp Arg Ser Thr Ser Ser Leu Ser Ser Ser Thr Asn Leu Trp Val Ser 
625 630 635 640 

Gly Ile Ala Asp Phe Leu His Glu Asp Gln Lys Gly Asn Gln Arg Ser 
645 650 655 

Tyr Arg His Ser Ser Ala Gly Tyr Ala Leu Gly Gly Gly Phe Phe Thr 
660 665 670 

Ala Ser Glu Asn Phe Phe Asn Phe Ala Phe Cys Gln Leu Phe Gly Tyr 
675 680 685 

Asp Lys Asp His Leu Val Ala Lys Asn His Thr His Val Tyr Ala Gly 
690 695 700 

Ala Met Ser Tyr Arg His Leu Gly Glu Ser Lys Thr Leu Ala Lys Ile 
705 710 715 720 

Leu Ser Gly Asn Ser Asp Ser Leu Pro Phe Val Phe Asn Ala Arg Phe 
725 730 735 

Ala Tyr Gly His Thr Asp Asn Asn Met Thr Thr Lys Tyr Thr Gly Tyr 
740 745 750 

Ser Pro Val Lys Gly Ser Trp Gly Asn Asp Ala Phe Gly Ile Glu Cys 
755 760 765 

Gly Gly Ala Ile Pro Val Val Ala Ser Gly Arg Arg Ser Trp Val Asp 
770 775 780 

Thr His Thr Pro Phe Leu Asn Leu Glu Met Ile Tyr Ala His Gln Asn 
785 790 795 800 

Asp Phe Lys Glu Asn Gly Thr Glu Gly Arg Ser Phe Gln Ser Glu Asp 
805 810 815 

Leu Phe Asn Leu Ala Val Pro Val Gly Ile Lys Phe Glu Lys Phe Ser 
820 825 830 

Asp Lys Ser Thr Tyr Asp Leu Ser Ile Ala Tyr Val Pro Asp Val Ile 
835 840 845 

Arg Asn Asp Pro Gly Cys Thr Thr Thr Leu Met Val Ser Gly Asp Ser 
850 855 860 

Trp Ser Thr Cys Gly Thr Ser Leu Ser Arg Gln Ala Leu Leu Val Arg 
865 870 875 880 

Ala Gly Asn His His Ala Phe Ala Ser Asn Phe Glu Val Phe Ser Gln 
885 890 895 

Phe Glu Val Glu Leu Arg Gly Ser Ser Arg Ser Tyr Ala Ile Asp Leu 
900 905 910 

Gly Gly Arg Phe Gly Phe 
915 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGAAATCCT CTCTTCATTG GTTTGTAATC TCGTCATCTT TAGCACTTCC CTTGTCACTA 60 

AATTTCTCTG CGTTTGCTGC TGTTGTTGAA ATCAATCTAG GACCTACCAA TAGCTTCTCT 120 

GGACCAGGAA CCTACACTCC TCCAGCCCAA ACAACAAATG CAGATGGAAC TATCTATAAT 180 

CTAACAGGGG ATGTCTCAAT CACCAATGCA GGATCTCCGA CAGCTCTAAC CGCTTCCTGC 240 

TTTAAAGAAA CTACTGGGAA TCTTTCTTTC CAAGGCCACG GCTACCAATT TCTCCTACAA 300 

AATATCGATG CGGGAGCGAA CTGTACCTTT ACCAATACAG CTGCAAATAA GCTTCTCTCC 360 

TTTTCAGGAT TCTCCTATTT GTCACTAATA CAAACCACGA ATGCTACCAC AGGAACAGGA 420 

GCCATCAAGT CCACAGGAGC TTGTTCTATT CAGTCGAACT ATAGTTGCTA CTTTGGCCAA 480 

AACTTTTCTA ATGACAATGG AGGCGCCCTC CAAGGCAGCT CTATCAGTCT ATCGCTAAAC 540 

CCCAACCTAA CGTTTGCCAA AAACAAAGCA ACGCAAAAAG GGGGTGCCCT CTATTCCACG 600 

GGAGGGATTA CAATTAACAA TACGTTAAAC TCAGCATCAT TTTCTGAAAA TACCGCGGCG 660 

AACAATGGCG GAGCCATTTA CACGGAAGCT AGCAGTTTTA TTAGCAGCAA CAAAGCAATT 720 

AGCTTTATAA ACAATAGTGT GACCGCAACC TCAGCTACAG GGGGAGCCAT TTACTGTAGT 780 

AGTACATCAG CCCCCAAACC AGTCTTAACT CTATCAGACA ACGGGGAACT GAACTTTATA 840 

GGAAATACAG CAATTACTAG TGGTGGGGCG ATTTATACTG ACAATCTAGT TCTTTCTTCT 900 

GGAGGACCTA CGCTTTTTAA AAACAACTCT GCTATAGATA CTGCAGCTCC CTTAGGAGGA 960 

GCAATTGCGA TTGCTGACTC TGGATCTTTG AGTCTTTCGG CTCTTGGTGG AGACATCACT 1020 

TTTGAAGGAA ACACAGTAGT CAAAGGAGCT TCTTCGAGTC A[illegible] 1080 

ATTAACATCG GAAACACCAA TGCTAAGATT GTACAGCTGC GAGCCTCTCA AGGCAATACT 1140 

ATCTACTTCT ATGATCCTAT AACAACTAAC CATACTGCAG CTCTCTCAGA TGCTCTAAAC 1200 

TTAAATGGTC CTGACCTTGC AGGGAATCCT GCATATCAAG GAACCATCGT ATTTTCTGGA 1260 

GAGAAGCTCT CGGAAGCAGA AGCTGCAGAA GCTGATAATC TCAAATCTAC AATTCAGCAA 1320 

CCTCTAACTC TTGCGGGAGG GCAACTCTCT CTTAAATCAG GAGTCACTCT AGTTGCTAAG 1380 

TCCTTTTCGC AATCTCCGGG CTCTACCCTC CTCATGGATG CAGGGACCAC ATTAGAAACC 1440 

GCTGATGGGA TCACTATCAA TAATCTTGTT CTCAATGTAG ATTCCTTAAA AGAGACCAAG 1500 

AAGGCTACGC TAAAAGCAAC ACAAGCAAGT CAGACAGTCA CTTTATCTGG ATCGCTCTCT 1560 

CTTGTAGATC CTTCTGGAAA TGTCTACGAA GATGTCTCTT GGAATAACCC TCAAGTCTTT 1620 

TCTTGTCTCA CTCTTACTGC TGACGACCCC GCGAATATTC ACATCACAGA CTTAGCTGCT 1680 

GATCCCCTAG AAAAAAATCC TATCCATTGG GGATACCAAG GGAATTGGGC ATTATCTTGG 1740 

CAAGAGGATA CTGCGACTAA ATCCAAAGCA GCGACTCTTA CCTGGACAAA AACAGGATAC 1800 

AATCCGAATC CTGAGCGTCG TGGAACCTTA GTTGCTAACA CGCTATGGGG ATCCTTTGTT 1860 

GATGTGCGCT CCATACAACA GCTTGTAGCC ACTAAAGTAC GCCAATCTCA AGAAACTCGC 1920 

GGCATCTGGT GTGAAGGGAT CTCGAACTTC TTCCATAAAG ATAGCACGAA GATAAATAAA 1980 

GGTTTTCGCC ACATAAGTGC AGGTTATGTT GTAGGAGCGA CTACAACATT AGCTTCTGAT 2040 

AATCTTATCA CTGCAGCCTT CTGCCAATTA TTCGGGAAAG ATAGAGATCA CTTTATAAAT 2100 

AAAAATAGAG CTTCTGCCTA TGCAGCTTCT CTCCATCTCC AGCATCTAGC GACCTTGTCT 2160 

TCTCCAAGCT TGTTACGCTA CCTTCCTGGA TCTGAAAGTG AGCAGCCTGT CCTCTTTGAT 2220 

GCTCAGATCA GCTATATCTA TAGTAAAAAT ACTATGAAAA CCTATTACAC CCAAGCACCA 2280 

AAGGGAGAGA GCTCGTGGTA TAATGACGGT TGCGCTCTGG AACTTGCGAG CTCCCTACCA 2340 

CACACTGCTT TAAGCCATGA GGGTCTCTTC CACGCGTATT TTCCTTTCAT CAAAGTAGAA 2400 

GCTTCGTACA TACACCAAGA TAGCTTCAAA GAACGTAATA CTACCTTGGT ACGATCTTTC 2460 

GATAGCGGTG ATTTAATTAA CGTCTCTGTG CCTATTGGAA TTACCTTCGA GAGATTCTCG 2520 

AGAAACGAGC GTGCGTCTTA CGAAGCTACT GTCATCTACG TTGCCGATGT CTATCGTAAG 2580 

AATCCTGACT GCACGACAGC TCTCCTAATC AACAATACCT CGTGGAAAAC TACAGGAACG 2640 

AATCTCTCAA GACAAGCTGG TATCGGAAGA GCAGGGATCT TTTATGCCTT CTCTCCAAAT 2700 

CTTGAGGTCA CAAGTAACCT ATCTATGGAA ATTCGTGGAT CTTCACGCAG CTACAATGCA 2760 

GATCTTGGAG GTAAGTTCCA GTTCTAA 2787 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 928 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Lys Ser Ser Leu His Trp Phe Val Ile Ser Ser Ser Leu Ala Leu 
1 5 10 15 

Pro Leu Ser Leu Asn Phe Ser Ala Phe Ala Ala Val Val Glu Ile Asn 
20 25 30 

Leu Gly Pro Thr Asn Ser Phe Ser Gly Pro Gly Thr Tyr Thr Pro Pro 
35 40 45 

Ala Gln Thr Thr Asn Ala Asp Gly Thr Ile Tyr Asn Leu Thr Gly Asp 
50 55 60 

Val Ser Ile Thr Asn Ala Gly Ser Pro Thr Ala Leu Thr Ala Ser Cys 
65 70 75 80 

Phe Lys Glu Thr Thr Gly Asn Leu Ser Phe Gln Gly His Gly Tyr Gln 
85 90 95 

Phe Leu Leu Gln Asn Ile Asp Ala Gly Ala Asn Cys Thr Phe Thr Asn 
100 105 110 

Thr Ala Ala Asn Lys Leu Leu Ser Phe Ser Gly Phe Ser Tyr Leu Ser 
115 120 125 

Leu Ile Gln Thr Thr Asn Ala Thr Thr Gly Thr Gly Ala Ile Lys Ser 
130 135 140 

Thr Gly Ala Cys Ser Ile Gln Ser Asn Tyr Ser Cys Tyr Phe Gly Gln 
145 150 155 160 

Asn Phe Ser Asn Asp Asn Gly Gly Ala Leu Gln Gly Ser Ser Ile Ser 
165 170 175 

Leu Ser Leu Asn Pro Asn Leu Thr Phe Ala Lys Asn Lys Ala Thr Gln 
180 185 190 

Lys Gly Gly Ala Leu Tyr Ser Thr Gly Gly Ile Thr Ile Asn Asn Thr 
195 200 205 

Leu Asn Ser Ala Ser Phe Ser Glu Asn Thr Ala Ala Asn Asn Gly Gly 
210 215 220 

Ala Ile Tyr Thr Glu Ala Ser Ser Phe Ile Ser Ser Asn Lys Ala Ile 
225 230 235 240 

Ser Phe Ile Asn Asn Ser Val Thr Ala Thr Ser Ala Thr Gly Gly Ala 
245 250 255 

Ile Tyr Cys Ser Ser Thr Ser Ala Pro Lys Pro Val Leu Thr Leu Ser 
260 265 270 

Asp Asn Gly Glu Leu Asn Phe Ile Gly Asn Thr Ala Ile Thr Ser Gly 
275 280 285 

Gly Ala Ile Tyr Thr Asp Asn Leu Val Leu Ser Ser Gly Gly Pro Thr 
290 295 300 

Leu Phe Lys Asn Asn Ser Ala Ile Asp Thr Ala Ala Pro Leu Gly Gly 
305 310 315 320 

Ala Ile Ala Ile Ala Asp Ser Gly Ser Leu Ser Leu Ser Ala Leu Gly 
325 330 335 

Gly Asp Ile Thr Phe Glu Gly Asn Thr Val Val Lys Gly Ala Ser Ser 
340 345 350 

Ser Gln Thr Thr Thr Arg Asn Ser Ile Asn Ile Gly Asn Thr Asn Ala 
355 360 365 

Lys Ile Val Gln Leu Arg Ala Ser Gln Gly Asn Thr Ile Tyr Phe Tyr 
370 375 380 

Asp Pro Ile Thr Thr Asn His Thr Ala Ala Leu Ser Asp Ala Leu Asn 
385 390 395 400 

Leu Asn Gly Pro Asp Leu Ala Gly Asn Pro Ala Tyr Gln Gly Thr Ile 
405 410 415 

Val Phe Ser Gly Glu Lys Leu Ser Glu Ala Glu Ala Ala Glu Ala Asp 
420 425 430 

Asn Leu Lys Ser Thr Ile Gln Gln Pro Leu Thr Leu Ala Gly Gly Gln 
435 440 445 








Leu Ser Leu Lys Ser Gly Val Thr Leu Val Ala Lys Ser Phe Ser Gln 
450 455 460 

Ser Pro Gly Ser Thr Leu Leu Met Asp Ala Gly Thr Thr Leu Glu Thr 
465 470 475 480 

Ala Asp Gly Ile Thr Ile Asn Asn Leu Val Leu Asn Val Asp Ser Leu 
485 490 495 

Lys Glu Thr Lys Lys Ala Thr Leu Lys Ala Thr Gln Ala Ser Gln Thr 
500 505 510 

Val Thr Leu Ser Gly Ser Leu Ser Leu Val Asp Pro Ser Gly Asn Val 
515 520 525 

Tyr Glu Asp Val Ser Trp Asn Asn Pro Gln Val Phe Ser Cys Leu Thr 
530 535 540 

Leu Thr Ala Asp Asp Pro Ala Asn Ile His Ile Thr Asp Leu Ala Ala 
545 550 555 560 

Asp Pro Leu Glu Lys Asn Pro Ile His Trp Gly Tyr Gln Gly Asn Trp 
565 570 575 

Ala Leu Ser Trp Gln Glu Asp Thr Ala Thr Lys Ser Lys Ala Ala Thr 
580 585 590 

Leu Thr Trp Thr Lys Thr Gly Tyr Asn Pro Asn Pro Glu Arg Arg Gly 
595 600 605 

Thr Leu Val Ala Asn Thr Leu Trp Gly Ser Phe Val Asp Val Arg Ser 
610 615 620 

Ile Gln Gln Leu Val Ala Thr Lys Val Arg Gln Ser Gln Glu Thr Arg 
625 630 635 640 

Gly Ile Trp Cys Glu Gly Ile Ser Asn Phe Phe His Lys Asp Ser Thr 
645 650 655 

Lys Ile Asn Lys Gly Phe Arg His Ile Ser Ala Gly Tyr Val Val Gly 
660 665 670 

Ala Thr Thr Thr Leu Ala Ser Asp Asn Leu Ile Thr Ala Ala Phe Cys 
675 680 685 

Gln Leu Phe Gly Lys Asp Arg Asp His Phe Ile Asn Lys Asn Arg Ala 
690 695 700 

Ser Ala Tyr Ala Ala Ser Leu His Leu Gln His Leu Ala Thr Leu Ser 
705 710 715 720 

Ser Pro Ser Leu Leu Arg Tyr Leu Pro Gly Ser Glu Ser Glu Gln Pro 
725 730 735 

Val Leu Phe Asp Ala Gln Ile Ser Tyr Ile Tyr Ser Lys Asn Thr Met 
740 745 750 

Lys Thr Tyr Tyr Thr Gln Ala Pro Lys Gly Glu Ser Ser Trp Tyr Asn 
755 760 765 

Asp Gly Cys Ala Leu Glu Leu Ala Ser Ser Leu Pro His Thr Ala Leu 
770 775 780 

Ser His Glu Gly Leu Phe His Ala Tyr Phe Pro Phe Ile Lys Val Glu 
785 790 795 800 

Ala Ser Tyr Ile His Gln Asp Ser Phe Lys Glu Arg Asn Thr Thr Leu 
805 810 815 

Val Arg Ser Phe Asp Ser Gly Asp Leu Ile Asn Val Ser Val Pro Ile 
820 825 830 

Gly Ile Thr Phe Glu Arg Phe Ser Arg Asn Glu Arg Ala Ser Tyr Glu 
835 840 845 

Ala Thr Val Ile Tyr Val Ala Asp Val Tyr Arg Lys Asn Pro Asp Cys 
850 855 860 

Thr Thr Ala Leu Leu Ile Asn Asn Thr Ser Trp Lys Thr Thr Gly Thr 
865 870 875 880 

Asn Leu Ser Arg Gln Ala Gly Ile Gly Arg Ala Gly Ile Phe Tyr Ala 
885 890 895 

Phe Ser Pro Asn Leu Glu Val Thr Ser Asn Leu Ser Met Glu Ile Arg 
900 905 910 

Gly Ser Ser Arg Ser Tyr Asn Ala Asp Leu Gly Gly Lys Phe Gln Phe 
915 920 925 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2793 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATGAAAATAC CCTTGCACAA ACTCCTGATC TCTTCGACTC TTGTCACTCC CATTCTATTG 60 

AGCATTGCAA CTTACGGAGC AGATGCTTCT TTATCCCCTA CAGATAGCTT TGATGGAGCG 120 

GGCGGCTCTA CATTTACTCC AAAATCTACA GCAGATGCCA ATGGAACGAA CTATGTCTTA 180 

TCAGGAAATG TCTATATAAA CGATGCTGGG AAAGGCACAG CATTAACAGG CTGCTGCTTT 240 

ACAGAAACTA CGGGTGATCT GACATTTACT GGAAAGGGAT ACTCATTTTC ATTCAACACG 300 

GTAGATGCGG GTTCGAATGC AGGAGCTGCG GCAAGCACAA CTGCTGATAA AGCCCTAACA 360 

TTCACAGGAT TTTCTAACCT TTCCTTCATT GCAGCTCCTG GAACTACAGT TGCTTCAGGA 420 

AAAAGTACTT TAAGTTCTGC AGGAGCCTTA AATCTTACCG ATAATGGAAC GATTCTCTTT 480 

AGCCAAAACG TCTCCAATGA AGCTAATAAC AATGGCGGAG CGATCACCAC AAAAACTCTT 540 

TCTATTTCTG GGAATACCTC TTCTATAACC TTCACTAGTA ATAGCGCAAA AAAATTAGGT 600 

GGAGCGATCT ATAGCTCTGC GGCTGCAAGT ATTTCAGGAA ACACCGGCCA GTTAGTCTTT 660 

ATGAATAATA AAGGAGAAAC TGGGGGCGGG GCTCTGGGCT TTGAAGCCAG CTCCTCGATT 720 

ACTCAAAATA GCTCCCTTTT CTTCTCTGGA AACACTGCAA CAGATGCTGC AGGCAAGGGC 780 

GGGGCCATTT ATTGTGAAAA AACAGGAGAG ACTCCTACTC TTACTATCTC TGGAAATAAA 840 

AGTCTGACCT TCGCCGAGAA CTCTTCAGTA ACTCAAGGCG GAGCAATCTG TGCCCATGGT 900 

CTAGATCTTT CCGCTGCTGG CCCTACCCTA TTTTCAAATA ATAGATGCGG GAACACAGCT 960 

GCAGGCAAGG GCGGCGCTAT TGCAATTGCC GACTCTGGAT CTTTAAGTCT CTCTGCAAAT 1020 

CAAGGAGACA TCACGTTCCT TGGCAACACT CTAACCTCAA CCTCCGCGCC AACATCGACA 1080 

CGGAATGCTA TCTACCTGGG ATCGTCAGCA AAAATTACGA ACTTAAGGGC AGCCCAAGGC 1140 

CAATCTATCT ATTTCTATGA TCCGATTGCA TCTAACACCA CAGGAGCTTC AGACGTTCTG 1200 

ACCATCAACC AACCGGATAG CAACTCGCCT TTAGATTATT CAGGAACGAT TGTATTTTCT 1260 

GGGGAAAAGC TCTCTGCAGA TGAAGCGAAA GCTGCTGATA ACTTCACATC TATATTAAAG 1320 

CAACCATTGG CTCTAGCCTC TGGAACCTTA GCACTCAAAG GAAATGTCGA GTTAGATGTC 1380 

AATGGTTTCA CACAGACTGA AGGCTCTACA CTCCTCATGC AACCAGGAAC AAAGCTCAAA 1440 

GCAGATACTG AAGCTATCAG TCTTACCAAA CTTGTCGTTG ATCTTTCTGC CTTAGAGGGA 1500 

AATAAGAGTG TGTCCATTGA AACAGCAGGA GCCAACAAAA CTATAACTCT AACCTCTCCT 1560 

CTTGTTTTCC AAGATAGTAG CGGCAATTTT TATGAAAGCC ATACGATAAA CCAAGCCTTC 1620 

ACGCAGCCTT TGGTGGTATT CACTGCTGCT ACTGCTGCTA GCGATATTTA TATCGATGCG 1680 

CTTCTCACTT CTCCAGTACA AACTCCAGAA CCTCATTACG GGTATCAGGG ACATTGGGAA 1740 

GCCACTTGGG CAGACACATC AACTGCAAAA TCAGGAACTA TGACTTGGGT AACTACGGGC 1800 

TACAACCCTA ATCCTGAGCG TAGAGCTTCC GTAGTTCCCG ATTCATTATG GGCATCCTTT 1860 

ACTGACATTC GCACTCTACA GCAGATCATG ACATCTCAAG CGAATAGTAT CTATCAGCAA 1920 

CGAGGACTCT GGGCATCAGG AACTGCGAAT TTCTTCCATA AGGATAAATC AGGAACTAAC 1980 

CAAGCATTCC GACATAAAAG CTACGGCTAT ATTGTTGGAG GAAGTGCTGA AGATTTTTCT 2040 

GAAAATATCT TCAGTGTAGC TTTCTGCCAG CTCTTCGGTA AAGATAAAGA CCTGTTTATA 2100 

GTTGAAAATA CCTCTCATAA CTATTTAGCG TCGCTATACC TGCAACATCG AGCATTCCTA 2160 

GGAGGACTTC CCATGCCCTC ATTTGGAAGT ATCACCGACA TGCTGAAAGA TATTCCTCTC 2220 

ATTTTGAATG CCCAGCTAAG CTACAGCTAC ACTAAAAATG ATATGGATAC TCGCTATACT 2280 

TCCTATCCTG AAGCTCAAGG TTCTTGGACC AATAATTCTG GGGCTCTAGA GCTCGGAGGA 2340 

TCTCTGGCTC TATATCTCCC TAAAGAAGCA CCGTTCTTCC AGGGATATTT CCCCTTCTTA 2400 

AAGTTCCAGG CAGTCTACAG CCGCCAACAA AACTTTAAAG AGAGTGGCGC TGAAGCCCGT 2460 

GCTTTTGATG ATGGAGACCT AGTGAACTGC TCTATCCCTG TCGGCATTCG GTTAGAAAAA 2520 

ATCTCCGAAG ATGAAAAAAA TAATTTCGAG ATTTCTCTAG CCAACATTGG TGATGTGTAT 2580 

CGTAAAAATC CCCGTTCGCG TACTTCTCTA ATGGTCAGTG GAGCCTCTTG GACTTCGCTA 2640 

TGTAAAAACC TCGCACGACA AGCCTTCTTA GCAAGTGCTG GAAGCCATCT GACTCTCTCC 2700 

CCTCATGTAG AACTCTCTGG GGAAGCTGCT TATGAGCTTC GTGGCTCAGC ACACATCTAC 2760 

AATGTAGATT GTGGGCTAAG ATACTCATTC TAG 2793 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 930 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



Met Lys Ile Pro Leu His Lys Leu Leu Ile Ser Ser Thr Leu Val Thr 
1 5 10 15 

Pro Ile Leu Leu Ser Ile Ala Thr Tyr Gly Ala Asp Ala Ser Leu Ser 
20 25 30 

Pro Thr Asp Ser Phe Asp Gly Ala Gly Gly Ser Thr Phe Thr Pro Lys 
35 40 45 

Ser Thr Ala Asp Ala Asn Gly Thr Asn Tyr Val Leu Ser Gly Asn Val 
50 55 60 

Tyr Ile Asn Asp Ala Gly Lys Gly Thr Ala Leu Thr Gly Cys Cys Phe 
65 70 75 80 

Thr Glu Thr Thr Gly Asp Leu Thr Phe Thr Gly Lys Gly Tyr Ser Phe 
85 90 95 

Ser Phe Asn Thr Val Asp Ala Gly Ser Asn Ala Gly Ala Ala Ala Ser 
100 105 110 

Thr Thr Ala Asp Lys Ala Leu Thr Phe Thr Gly Phe Ser Asn Leu Ser 
115 120 125 

Phe Ile Ala Ala Pro Gly Thr Thr Val Ala Ser Gly Lys Ser Thr Leu 
130 135 140 

Ser Ser Ala Gly Ala Leu Asn Leu Thr Asp Asn Gly Thr Ile Leu Phe 
145 150 155 160 

Ser Gln Asn Val Ser Asn Glu Ala Asn Asn Asn Gly Gly Ala Ile Thr 
165 170 175 

Thr Lys Thr Leu Ser Ile Ser Gly Asn Thr Ser Ser Ile Thr Phe Thr 
180 185 190 

Ser Asn Ser Ala Lys Lys Leu Gly Gly Ala Ile Tyr Ser Ser Ala Ala 
195 200 205 

Ala Ser Ile Ser Gly Asn Thr Gly Gln Leu Val Phe Met Asn Asn Lys 
210 215 220 

Gly Glu Thr Gly Gly Gly Ala Leu Gly Phe Glu Ala Ser Ser Ser Ile 
225 230 235 240 

Thr Gln Asn Ser Ser Leu Phe Phe Ser Gly Asn Thr Ala Thr Asp Ala 
245 250 255 

Ala Gly Lys Gly Gly Ala Ile Tyr Cys Glu Lys Thr Gly Glu Thr Pro 
260 265 270 

Thr Leu Thr Ile Ser Gly Asn Lys Ser Leu Thr Phe Ala Glu Asn Ser 
275 280 285 

Ser Val Thr Gln Gly Gly Ala Ile Cys Ala His Gly Leu Asp Leu Ser 
290 295 300 

Ala Ala Gly Pro Thr Leu Phe Ser Asn Asn Arg Cys Gly Asn Thr Ala 
305 310 315 320 

Ala Gly Lys Gly Gly Ala Ile Ala Ile Ala Asp Ser Gly Ser Leu Ser 
325 330 335 

Leu Ser Ala Asn Gln Gly Asp Ile Thr Phe Leu Gly Asn Thr Leu Thr 
340 345 350 

Ser Thr Ser Ala Pro Thr Ser Thr Arg Asn Ala Ile Tyr Leu Gly Ser 
355 360 365 

Ser Ala Lys Ile Thr Asn Leu Arg Ala Ala Gln Gly Gln Ser Ile Tyr 
370 375 380 

Phe Tyr Asp Pro Ile Ala Ser Asn Thr Thr Gly Ala Ser Asp Val Leu 
385 390 395 400 

Thr Ile Asn Gln Pro Asp Ser Asn Ser Pro Leu Asp Tyr Ser Gly Thr 
405 410 415 

Ile Val Phe Ser Gly Glu Lys Leu Ser Ala Asp Glu Ala Lys Ala Ala 
420 425 430 

Asp Asn Phe Thr Ser Ile Leu Lys Gln Pro Leu Ala Leu Ala Ser Gly 
435 440 445 

Thr Leu Ala Leu Lys Gly Asn Val Glu Leu Asp Val Asn Gly Phe Thr 
450 455 460 

Gln Thr Glu Gly Ser Thr Leu Leu Met Gln Pro Gly Thr Lys Leu Lys 
465 470 475 480 

Ala Asp Thr Glu Ala Ile Ser Leu Thr Lys Leu Val Val Asp Leu Ser 
485 490 495 

Ala Leu Glu Gly Asn Lys Ser Val Ser Ile Glu Thr Ala Gly Ala Asn 
500 505 510 

Lys Thr Ile Thr Leu Thr Ser Pro Leu Val Phe Gln Asp Ser Ser Gly 
515 520 525 

Asn Phe Tyr Glu Ser His Thr Ile Asn Gln Ala Phe Thr Gln Pro Leu 
530 535 540 

Val Val Phe Thr Ala Ala Thr Ala Ala Ser Asp Ile Tyr Ile Asp Ala 
545 550 555 560 

Leu Leu Thr Ser Pro Val Gln Thr Pro Glu Pro His Tyr Gly Tyr Gln 
565 570 575 

Gly His Trp Glu Ala Thr Trp Ala Asp Thr Ser Thr Ala Lys Ser Gly 
580 585 590 

Thr Met Thr Trp Val Thr Thr Gly Tyr Asn Pro Asn Pro Glu Arg Arg 
595 600 605 

Ala Ser Val Val Pro Asp Ser Leu Trp Ala Ser Phe Thr Asp Ile Arg 
610 615 620 

Thr Leu Gln Gln Ile Met Thr Ser Gln Ala Asn Ser Ile Tyr Gln Gln 
625 630 635 640 

Arg Gly Leu Trp Ala Ser Gly Thr Ala Asn Phe Phe His Lys Asp Lys 
645 650 655 

Ser Gly Thr Asn Gln Ala Phe Arg His Lys Ser Tyr Gly Tyr Ile Val 
660 665 670 

Gly Gly Ser Ala Glu Asp Phe Ser Glu Asn Ile Phe Ser Val Ala Phe 
675 680 685 

Cys Gln Leu Phe Gly Lys Asp Lys Asp Leu Phe Ile Val Glu Asn Thr 
690 695 700 

Ser His Asn Tyr Leu Ala Ser Leu Tyr Leu Gln His Arg Ala Phe Leu 
705 710 715 720 

Gly Gly Leu Pro Met Pro Ser Phe Gly Ser Ile Thr Asp Met Leu Lys 
725 730 735 

Asp Ile Pro Leu Ile Leu Asn Ala Gln Leu Ser Tyr Ser Tyr Thr Lys 
740 745 750 

Asn Asp Met Asp Thr Arg Tyr Thr Ser Tyr Pro Glu Ala Gln Gly Ser 
755 760 765 

Trp Thr Asn Asn Ser Gly Ala Leu Glu Leu Gly Gly Ser Leu Ala Leu 
770 775 780 

Tyr Leu Pro Lys Glu Ala Pro Phe Phe Gln Gly Tyr Phe Pro Phe Leu 
785 790 795 800 

Lys Phe Gln Ala Val Tyr Ser Arg Gln Gln Asn Phe Lys Glu Ser Gly 
805 810 815 

Ala Glu Ala Arg Ala Phe Asp Asp Gly Asp Leu Val Asn Cys Ser Ile 
820 825 830 

Pro Val Gly Ile Arg Leu Glu Lys Ile Ser Glu Asp Glu Lys Asn Asn 
835 840 845 

Phe Glu Ile Ser Leu Ala Asn Ile Gly Asp Val Tyr Arg Lys Asn Pro 
850 855 860 

Arg Ser Arg Thr Ser Leu Met Val Ser Gly Ala Ser Trp Thr Ser Leu 
865 870 875 880 

Cys Lys Asn Leu Ala Arg Gln Ala Phe Leu Ala Ser Ala Gly Ser His 
885 890 895 

Leu Thr Leu Ser Pro His Val Glu Leu Ser Gly Glu Ala Ala Tyr Glu 
900 905 910 

Leu Arg Gly Ser Ala His Ile Tyr Asn Val Asp Cys Gly Leu Arg Tyr 
915 920 925 

Ser Phe 
930 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 840 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



GAAGACAATA TAAGGTACCG TCATAACAGC GGGGGTTATG CACTAGGGAT CACAGCAACA 60 

ACTCCTGCCG AGGATCAGCT TACTTTTGCC TTCTGCCAGC TCTTTGCTAG AGATCGCAAT 120 

CATATTACAG GTAAGAACCA CGGAGATACT TACGGTGCCT CTTTGTATTT CCACCATACA 180 

GAAGGGCTCT TCGACATCGC CAATTTCCTC TGGGGAAAAG CAACCCGAGC TCCCTGGGTG 240

CTCTCTGAGA TCTCCCAGAT CATTCCTTTA TCGTTCGATG CTAAATTCAG TTATCTCCAT 300

ACAGACAACC ACATGAAGAC ATATTATACC GATAACTCTA TCATCAAGGG TTCTTGGAGA 360

AACGATGCCT TCTGTGCAGA TCTTGGAGCT AGCCTGCCTT TTGTTATTTC CGTTCCGTAT 420

CTTCTGAAAG AAGTCGAACC TTTTGTCAAA GTACAGTATA TCTATGCGCA TCAGCAAGAC 480

TTCTACGAGC GTCATGCTGA AGGACGCGCT TTCAATAAAA GCGAGCTTAT CAACGTAGAG 540

ATTCCTATAG GCGTCACCTT CGAAAGAGAC TCAAAATCAG AAAAGGGAAC TTACGATCTT 600

ACTCTTATGT ATATACTCGA TGCTTACCGA CGCAATCCTA AATGTCAAAC TTCCCTAATA 660

GCTAGCGATG CTAACTGGAT GGCCTATGGT ACCAACCTCG CACGACAAGG TTTTTCTGTT 720

CGTGCTGCGA ACCATTTCCA AGTGAACCCC CACATGGAAA TCTTCGGTCA ATTCGCTTTT 780

GAAGTACGAA GTTCTTCACG AAATTATAAT ACAAACCTAG GCTCTAAGTT TTGTTTCTAG 840 



(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 



Glu Asp Asn Ile Arg Tyr Arg His Asn Ser Gly Gly Tyr Ala Leu Gly
1 5 10 15

Ile Thr Ala Thr Thr Pro Ala Glu Asp Gln Leu Thr Phe Ala Phe Cys
20 25 30

Gln Leu Phe Ala Arg Asp Arg Asn His Ile Thr Gly Lys Asn His Gly
35 40 45

Asp Thr Tyr Gly Ala Ser Leu Tyr Phe His His Thr Glu Gly Leu Phe
50 55 60

Asp Ile Ala Asn Phe Leu Trp Gly Lys Ala Thr Arg Ala Pro Trp Val
65 70 75 80

Leu Ser Glu Ile Ser Gln Ile Ile Pro Leu Ser Phe Asp Ala Lys Phe
85 90 95

Ser Tyr Leu His Thr Asp Asn His Met Lys Thr Tyr Tyr Thr Asp Asn
100 105 110

Ser Ile Ile Lys Gly Ser Trp Arg Asn Asp Ala Phe Cys Ala Asp Leu
115 120 125

Gly Ala Ser Leu Pro Phe Val Ile Ser Val Pro Tyr Leu Leu Lys Glu
130 135 140

Val Glu Pro Phe Val Lys Val Gln Tyr Ile Tyr Ala His Gln Gln Asp
145 150 155 160

Phe Tyr Glu Arg His Ala Glu Gly Arg Ala Phe Asn Lys Ser Glu Leu
165 170 175

Ile Asn Val Glu Ile Pro Ile Gly Val Thr Phe Glu Arg Asp Ser Lys
180 185 190

Ser Glu Lys Gly Thr Tyr Asp Leu Thr Leu Met Tyr Ile Leu Asp Ala
195 200 205

Tyr Arg Arg Asn Pro Lys Cys Gln Thr Ser Leu Ile Ala Ser Asp Ala
210 215 220

Asn Trp Met Ala Tyr Gly Thr Asn Leu Ala Arg Gln Gly Phe Ser Val
225 230 235 240

Arg Ala Ala Asn His Phe Gln Val Asn Pro His Met Glu Ile Phe Gly
245 250 255

Gln Phe Ala Phe Glu Val Arg Ser Ser Ser Arg Asn Tyr Asn Thr Asn
260 265 270

Leu Gly Ser Lys Phe Cys Phe
275





















(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1545 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
ATGACCATAC TTCGAAATTT TCTTACCTGC TCGGCTTTAT TCCTCGCTCT CCCTGCAGCA 60

GCACAAGTTG TATATCTTCA TGAAAGTGAT GGTTATAACG GTGCTATCAA TAATAAAAGC 120

TTAGAACCTA AAATTACCTG TTATCCAGAA GGAACTTCTT ACATCTTTCT AGATGACGTG 180

AGGATTTCCA ACGTTAAGCA TGATCAAGAA GATGCTGGGG TTTTTATAAA TCGATCTGGG 240

AATCTTTTTT TCATGGGCAA CCGTTGCAAC TTCACTTTTC ACAACCTTAT GACCGAGGGT 300

TTTGGCGCTG CCATTTCGAA CCGCGTTGGA GACACCACTC TCACTCTCTC TAATTTTTCT 360

TACTTAACGT TCACCTCAGC ACCTCTACTA CCTCAAGGAC AAGGAGCGAT TTATAGTCTT 420

GGTTCCGTGA TGATCGAAAA TAGTGAGGAA GTGACTTTCT GTGGGAACTA CTCTTCGTGG 480

AGTGGAGCTG CGATTTATAC TCCCTACCTT TTAGGTTCTA AGGCGAGTCG TCCTTCAGTA 540

AATCTCAGCG GGAACCGCTA CCTGGTGTTT AGAGACTATG TGAGCCAAGG TTATGGCGGC 600

GCCGTATCTA CCCACAATCT CACACTCACG ACTCGAGGAC CTTCGTGTTT TGAAAATAAT 660

CATGCTTATC ATGACGTGAA TAGTAATGGA GGAGCCATTG CCATTGCTCC TGGAGGATCG 720

ATCTCTATAT CCGTGAAAAG CGGAGATCTC ATCTTCAAAG GAAATACAGC ATCACAAGAC 780

GGAAATACAA TACACAACTC CATCCATCTG CAATCTGGAG CACAGTTTAA GAACCTACGT 840

GCTGTTTCAG AATCCGGAGT TTATTTCTAT GATCCTATAA GCCATAGCGA GTCGCATAAA 900

ATTACAGATC TTGTAATCAA TGCTCCTGAA GGAAAGGAAA CTTATGAAGG AACAATTAGC 960

TTCTCAGGAC TATGCCTGGA TGATCATGAA GTTTGTGCGG AAAATCTTAC TTCCACAATC 1020

CTACAAGATG TCACATTAGC AGGAGGAACT CTCTCTCTAT CGGATGGGGT TACCTTGCAA 1080

CTGCATTCTT TTAAGCAGGA AGCAAGCTCT ACGCTTACTA TGTCTCCAGG AACCACTCTG 1140

CTCTGCTCAG GAGATGCTCG GGTTCAGAAT CTGCACATCC TGATTGAAGA TACCGACAAC 1200

TTTGTTCCTG TAAGGATTCG CGCCGAGGAC AAGGATGCTC TTGTCTCATT AGAAAAACTT 1260

AAAGTTGCCT TTGAGGCTTA TTGGTCCGTC TATGACTTTC CTCAATTTAA GGAAGCCTTT 1320

ACGATTCCTC TTCTTGAACT TCTAGGGCCT TCTTTTGACA GTCTTCTCCT AGGGGAGACC 1380

ACTTTGGAGA GAACCCAAGT CACAACAGAG AATGACGCCG TTCGAGGTTT CTGGTCCCTA 1440

AGCTGGGAAG AGTACCCCCC TTCTCTGGAT AAAGACAGAA GGATCACACC AACTAAGAAA 1500

ACTGTTTTCC TCACTTGGAA TCCTGAGATC ACTTCTACGC CATAA 1545

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 514 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Thr Ile Leu Arg Asn Phe Leu Thr Cys Ser Ala Leu Phe Leu Ala
1 5 10 15

Leu Pro Ala Ala Ala Gln Val Val Tyr Leu His Glu Ser Asp Gly Tyr
20 25 30

Asn Gly Ala Ile Asn Asn Lys Ser Leu Glu Pro Lys Ile Thr Cys Tyr
35 40 45

Pro Glu Gly Thr Ser Tyr Ile Phe Leu Asp Asp Val Arg Ile Ser Asn
50 55 60

Val Lys His Asp Gln Glu Asp Ala Gly Val Phe Ile Asn Arg Ser Gly
65 70 75 80

Asn Leu Phe Phe Met Gly Asn Arg Cys Asn Phe Thr Phe His Asn Leu
85 90 95

Met Thr Glu Gly Phe Gly Ala Ala Ile Ser Asn Arg Val Gly Asp Thr
100 105 110

Thr Leu Thr Leu Ser Asn Phe Ser Tyr Leu Thr Phe Thr Ser Ala Pro
115 120 125

Leu Leu Pro Gln Gly Gln Gly Ala Ile Tyr Ser Leu Gly Ser Val Met
130 135 140

Ile Glu Asn Ser Glu Glu Val Thr Phe Cys Gly Asn Tyr Ser Ser Trp
145 150 155 160

Ser Gly Ala Ala Ile Tyr Thr Pro Tyr Leu Leu Gly Ser Lys Ala Ser
165 170 175

Arg Pro Ser Val Asn Leu Ser Gly Asn Arg Tyr Leu Val Phe Arg Asp
180 185 190

Tyr Val Ser Gln Gly Tyr Gly Gly Ala Val Ser Thr His Asn Leu Thr
195 200 205

Leu Thr Thr Arg Gly Pro Ser Cys Phe Glu Asn Asn His Ala Tyr His
210 215 220

Asp Val Asn Ser Asn Gly Gly Ala Ile Ala Ile Ala Pro Gly Gly Ser
225 230 235 240

Ile Ser Ile Ser Val Lys Ser Gly Asp Leu Ile Phe Lys Gly Asn Thr
245 250 255

Ala Ser Gln Asp Gly Asn Thr Ile His Asn Ser Ile His Leu Gln Ser
260 265 270

Gly Ala Gln Phe Lys Asn Leu Arg Ala Val Ser Glu Ser Gly Val Tyr
275 280 285

Phe Tyr Asp Pro Ile Ser His Ser Glu Ser His Lys Ile Thr Asp Leu
290 295 300

Val Ile Asn Ala Pro Glu Gly Lys Glu Thr Tyr Glu Gly Thr Ile Ser
305 310 315 320

Phe Ser Gly Leu Cys Leu Asp Asp His Glu Val Cys Ala Glu Asn Leu
325 330 335

Thr Ser Thr Ile Leu Gln Asp Val Thr Leu Ala Gly Gly Thr Leu Ser
340 345 350

Leu Ser Asp Gly Val Thr Leu Gln Leu His Ser Phe Lys Gln Glu Ala
355 360 365

Ser Ser Thr Leu Thr Met Ser Pro Gly Thr Thr Leu Leu Cys Ser Gly
370 375 380

Asp Ala Arg Val Gln Asn Leu His Ile Leu Ile Glu Asp Thr Asp Asn
385 390 395 400

Phe Val Pro Val Arg Ile Arg Ala Glu Asp Lys Asp Ala Leu Val Ser
405 410 415

Leu Glu Lys Leu Lys Val Ala Phe Glu Ala Tyr Trp Ser Val Tyr Asp
420 425 430

Phe Pro Gln Phe Lys Glu Ala Phe Thr Ile Pro Leu Leu Glu Leu Leu
435 440 445

Gly Pro Ser Phe Asp Ser Leu Leu Leu Gly Glu Thr Thr Leu Glu Arg
450 455 460

Thr Gln Val Thr Thr Glu Asn Asp Ala Val Arg Gly Phe Trp Ser Leu
465 470 475 480

Ser Trp Glu Glu Tyr Pro Pro Ser Leu Asp Lys Asp Arg Arg Ile Thr
485 490 495

Pro Thr Lys Lys Thr Val Phe Leu Thr Trp Asn Pro Glu Ile Thr Ser
500 505 510

Thr Pro



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 787 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATGAAAACGT CTATTCGTAA GTTCTTAATT TCTACCACAC TGGCGCCATG TTTTGCTTCA 60

ACAGCGTTTA CTGTAGAAGT TATCATGCCT TCCGAGAACT TTGATGGATC GAGTGGGAAG 120

ATTTTTCCTT ACACAACACT TTCTGATCCT AGAGGGACAC TCTGTATTTT TTCAGGGGAT 180

CTCTACATTG CGAATCTTGA TAATGCCATA TCCAGAACCT CTTCCAGTTG CTTTAGCAAT 240

AGGGCGGGAG CACTACAAAT CTTAGGAAAA GGTGGGGTTT TCTCCTTCTT AAATATCCGT 300

TCTTCAGCTG ACGGAGCCGC GATTAGTAGT GTAATCACCC AAAATCCTGA ACTATGTCCC 360

TTGAGTTTTT CAGGATTTAG TCAGATGATC TTCGATAACT GTGAATCTTT GACTTCAGAT 420

ACCTCAGCGA GTAATGTCAT ACCTCACGCA TCGGCGATTT ACGCTACAAC GCCCATGCTC 480

TTTACAAACA ATGACTCCAT ACTATTCCAA TACAACCGTT CTGCAGGATT TGGAGCTGCC 540

ATTCGAGGCA CAAGCATCAC AATAGAAAAT ACGAAAAAGA GCCTTCTCTT TAATGGTAAT 600

GGATCCATCT CTAATGGAGG GGCCCTCACG GGATCTGCAG CGATCAACCT CATCAACAAT 660

AGCGCTCCTG TGATTTTCTC AACGAATGCT ACAGGGATCT ATGGTGGGGC TATTTACCTT 720

ACCGGAGGAT CTATGCTCAC CTCTGGGAAC CTCTCAGGAG TCTTGTTCGT TTATAATAGC 780

TCGCGCT 787



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 262 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



Met Lys Thr Ser Ile Arg Lys Phe Leu Ile Ser Thr Thr Leu Ala Pro
1 5 10 15

Cys Phe Ala Ser Thr Ala Phe Thr Val Glu Val Ile Met Pro Ser Glu
20 25 30

Asn Phe Asp Gly Ser Ser Gly Lys Ile Phe Pro Tyr Thr Thr Leu Ser
35 40 45

Asp Pro Arg Gly Thr Leu Cys Ile Phe Ser Gly Asp Leu Tyr Ile Ala
50 55 60

Asn Leu Asp Asn Ala Ile Ser Arg Thr Ser Ser Ser Cys Phe Ser Asn
65 70 75 80

Arg Ala Gly Ala Leu Gln Ile Leu Gly Lys Gly Gly Val Phe Ser Phe
85 90 95

Leu Asn Ile Arg Ser Ser Ala Asp Gly Ala Ala Ile Ser Ser Val Ile
100 105 110

Thr Gln Asn Pro Glu Leu Cys Pro Leu Ser Phe Ser Gly Phe Ser Gln
115 120 125

Met Ile Phe Asp Asn Cys Glu Ser Leu Thr Ser Asp Thr Ser Ala Ser
130 135 140

Asn Val Ile Pro His Ala Ser Ala Ile Tyr Ala Thr Thr Pro Met Leu
145 150 155 160

Phe Thr Asn Asn Asp Ser Ile Leu Phe Gln Tyr Asn Arg Ser Ala Gly
165 170 175

Phe Gly Ala Ala Ile Arg Gly Thr Ser Ile Thr Ile Glu Asn Thr Lys
180 185 190

Lys Ser Leu Leu Phe Asn Gly Asn Gly Ser Ile Ser Asn Gly Gly Ala
195 200 205

Leu Thr Gly Ser Ala Ala Ile Asn Leu Ile Asn Asn Ser Ala Pro Val
210 215 220

Ile Phe Ser Thr Asn Ala Thr Gly Ile Tyr Gly Gly Ala Ile Tyr Leu
225 230 235 240

Thr Gly Gly Ser Met Leu Thr Ser Gly Asn Leu Ser Gly Val Leu Phe
245 250 255

Val Tyr Asn Ser Ser Arg
260

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2838 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

ATGAAGACTT CAGTTTCTAT GTTGTTGGCC CTGCTTTGCT CGGGGGCTAG CTCTATTGTA 60

CTCCATGCCG CAACCACTCC ACTAAATCCT GAAGATGGGT TTATTGGGGA GGGCAATACA 120

AATACTTTTT CTCCGAAATC TACAACGGAT GCTGCAGGAA CTACCTACTC TCTCACAGGA 180

GAGGTTCTGT TTATAGATCC GGGGAAAGGT GGTTCAATTA CAGGAACTTG CTTTGTAGAA 240

ACTGCTGGCG ATCTTACATT TTTAGGTAAT GGAAATACCC TAAAGTTCCT GTCGGTAGAT 300

GCAGGTGCTA ATATCGCGGT TGCTCATGTA CAAGGAAGTA AGAATTTAAG CTTCACAGAT 360

TTCCTTTCTC TGGTGATCAC AGAATCTCCA AAATCCGCTG TTAGTACAGG AAAAGGTAGC 420

CTAGTCAGTT CAGGTGCAGT CCAACTGCAA GATATAAACA CTCTAGTTCT TACAAGCAAT 480

GCCTCTGTCG AAGATGGTGG CGTGATTAAA GGAAACTCCT GCTTGATTCA GGGAATCAAA 540

AATAGTGCGA TTTTTGGACA AAATACATCT TCGAAAAAAG GAGGGGCGAT CTCCACGACT 600

CAAGGACTCA CCATAGAGAA TAACTTAGGG ACGCTAAAGT TCAATGAAAA CAAAGCAGTG 660

ACCTCAGGAG GCGCCTTAGA TTTAGGAGCC GCGTCTACAT TCACTGCGAA CCATGAGTTG 720

ATATTTTCAC AAAATAAGAC TTCTGGGAAT GCTGCAAATG GCGGAGCCAT AAATTGCTCA 780

GGCGACCTAA CATTTACTGA TAACACTTCT TTGTTACTTC AAGAAAATAG CACAATGCAG 840

GATGGTGGAG CTTTGTGTAG CACAGGAACC ATAAGCATTA CCGGTAGTGA TTCTATCAAT 900

GTGATAGGAA ATACTTCAGG ACAAAAAGGA GGAGCGATTT CTGCAGCTTC TCTCAAGATT 960

TTGGGAGGGC AGGGAGGCGC TCTCTTTTCT AATAACGTAG TGACTCATGC CACCCCTCTA 1020

GGAGGTGCCA TTTTTATCAA CACAGGAGGA TCCTTGCAGC TCTTCACTCA AGGAGGGGAT 1080

ATCGTATTCG AGGGGAATCA GGTCACTACA ACAGCTCCAA ATGCTACCAC TAAGAGAAAT 1140

GTAATTCACC TCGAGAGCAC CGCGAAGTGG ACGGGACTTG CTGCAAGTCA AGGTAACGCT 1200

ATCTATTTCT ATGATCCCAT TACCACCAAC GATACGGGAG CAAGCGATAA CTTACGTATC 1260

AATGAGGTCA GTGCAAATCA AAAGCTCTCG GGATCTATAG TATTTTCTGG AGAGAGATTG 1320

TCGACAGCAG AAGCTATAGC TGAAAATCTT ACTTCGAGGA TCAACCAGCC TGTCACTTTA 1380

GTAGAGGGGA GCTTAGAACT TAAACAGGGA GTGACCTTGA TCACACAAGG ATTCTCGCAG 1440

GAGCCAGAAT CCACGCTTCT TTTGGATTTG GGGACCTCAT TACAAGCTTC TACAGAAGAT 1500

ATCGTCATCA CAAATTCATC TATAAATGCC GATACCATTT ACGGAAAGAA TCCAATCAAT 1560

ATTGTAGCTT CAGCAGCGAA TAAGAACATT ACCCTAACAG GAACCTTAGC ACTTGTAAAT 1620

GCAGATGGAG CTTTGTATGA GAACCATACC TTGCAAGACT CTCAAGATTA TAGCTTTGTA 1680

AAGTTATCTC CAGGAGCGGG AGGGACTATA ATTACTCAAG ATGCTTCTCA GAAGCTTCTT 1740

GAAGTAGCTC CTTCTAGACC ACATTATGGC TATCAAGGAC ATTGGAATGT GCAAGTCATC 1800

CCAGGAACGG GAACTCAACC GAGCCAGGCA AATTTAGAAT GGGTGCGGAC AGGATACCTT 1860

CCGAATCCCG AACGGCAAGG ATTTTTAGTT CCCAATAGCC TGTGGGGTTC TTTTGTTGAT 1920

CAGCGTGCTA TCCAAGAAAT CATGGTAAAT AGTAGCCAAA TCTTATGTCA GGAACGGGGA 1980

GTCTGGGGAG CTGGAATTGC TAATTTCCTA CATAGAGATA AAATTAATGA GCACGGCTAT 2040

CGCCATAGCG GTGTCGGTTA TCTTGTGGGA GTTGGCACTC ATGCTTTTTC TGATGCTACG 2100

ATAAATGCGG CTTTTTGCCA GCTCTTCAGT AGAGATAAAG ACTACGTAGT ATCCAAAAAT 2160

CATGGAACTA GCTACTCAGG GGTCGTATTT CTTGAGGATA CCCTAGAGTT TAGAAGTCCA 2220

CAGGGATTCT ATACTGATAG CTCCTCAGAA GCTTGCTGTA ACCAAGTCGT CACTATAGAT 2280

ATGCAGTTGT CTTACAGCCA TAGAAATAAT GATATGAAAA CCAAATACAC GACATATCCA 2340

GAAGCTCAGG GATCTTGGGC AAATGATGTT TTTGGTCTTG AGTTTGGAGC GACTACATAC 2400

TACTACCCTA ACAGTACTTT TTTATTTGAT TACTACTCTC CGTTTCTCAG GCTGCAGTGC 2460

ACCTATGCTC ACCAGGAAGA CTTCAAAGAG ACAGGAGGTG AGGTTCGTCA CTTTACTAGC 2520

GGAGATCTTT TCAATTTAGC AGTTCCTATT GGCGTGAAGT TTGAGAGATT TTCAGACTGT 2580

AAAAGGGGAT CTTATGAACT TACCCTTGCT TATGTTCCTG ATGTGATTCG CAAAGATCCC 2640

AAGAGCACGG CAACATTGGC TAGTGGAGCT ACGTGGAGCA CCCACGGAAA CAATCTCTCC 2700

AGACAAGGAT TACAACTGCG TTTAGGGAAC CACTGTCTCA TAAATCCTGG AATTGAGGTG 2760

TTCAGTCACG GAGCTATTGA ATTGCGGGGA TCCTCTCGTA ATTATAACAT CAATCTCGGG 2820

GGTAAATACC GATTTTAA 2838

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 946 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 





Met Lys Thr Ser Val Ser Met Leu Leu Ala Leu Leu Cys Ser Gly Ala
1 5 10 15

Ser Ser Ile Val Leu His Ala Ala Thr Thr Pro Leu Asn Pro Glu Asp
20 25 30

Gly Phe Ile Gly Glu Gly Asn Thr Asn Thr Phe Ser Pro Lys Ser Thr
35 40 45

Thr Asp Ala Ala Gly Thr Thr Tyr Ser Leu Thr Gly Glu Val Leu Phe
50 55 60

Ile Asp Pro Gly Lys Gly Gly Ser Ile Thr Gly Thr Cys Phe Val Glu
65 70 75 80

Thr Ala Gly Asp Leu Thr Phe Leu Gly Asn Gly Asn Thr Leu Lys Phe
85 90 95

Leu Ser Val Asp Ala Gly Ala Asn Ile Ala Val Ala His Val Gln Gly
100 105 110

Ser Lys Asn Leu Ser Phe Thr Asp Phe Leu Ser Leu Val Ile Thr Glu
115 120 125

Ser Pro Lys Ser Ala Val Ser Thr Gly Lys Gly Ser Leu Val Ser Ser
130 135 140

Gly Ala Val Gln Leu Gln Asp Ile Asn Thr Leu Val Leu Thr Ser Asn
145 150 155 160

Ala Ser Val Glu Asp Gly Gly Val Ile Lys Gly Asn Ser Cys Leu Ile
165 170 175

Gln Gly Ile Lys Asn Ser Ala Ile Phe Gly Gln Asn Thr Ser Ser Lys
180 185 190

Lys Gly Gly Ala Ile Ser Thr Thr Gln Gly Leu Thr Ile Glu Asn Asn
195 200 205

Leu Gly Thr Leu Lys Phe Asn Glu Asn Lys Ala Val Thr Ser Gly Gly
210 215 220

Ala Leu Asp Leu Gly Ala Ala Ser Thr Phe Thr Ala Asn His Glu Leu
225 230 235 240

Ile Phe Ser Gln Asn Lys Thr Ser Gly Asn Ala Ala Asn Gly Gly Ala
245 250 255

Ile Asn Cys Ser Gly Asp Leu Thr Phe Thr Asp Asn Thr Ser Leu Leu
260 265 270

Leu Gln Glu Asn Ser Thr Met Gln Asp Gly Gly Ala Leu Cys Ser Thr
275 280 285

Gly Thr Ile Ser Ile Thr Gly Ser Asp Ser Ile Asn Val Ile Gly Asn
290 295 300

Thr Ser Gly Gln Lys Gly Gly Ala Ile Ser Ala Ala Ser Leu Lys Ile
305 310 315 320

Leu Gly Gly Gln Gly Gly Ala Leu Phe Ser Asn Asn Val Val Thr His
325 330 335

Ala Thr Pro Leu Gly Gly Ala Ile Phe Ile Asn Thr Gly Gly Ser Leu
340 345 350

Gln Leu Phe Thr Gln Gly Gly Asp Ile Val Phe Glu Gly Asn Gln Val
355 360 365

Thr Thr Thr Ala Pro Asn Ala Thr Thr Lys Arg Asn Val Ile His Leu
370 375 380

Glu Ser Thr Ala Lys Trp Thr Gly Leu Ala Ala Ser Gln Gly Asn Ala
385 390 395 400

Ile Tyr Phe Tyr Asp Pro Ile Thr Thr Asn Asp Thr Gly Ala Ser Asp
405 410 415

Asn Leu Arg Ile Asn Glu Val Ser Ala Asn Gln Lys Leu Ser Gly Ser
420 425 430

Ile Val Phe Ser Gly Glu Arg Leu Ser Thr Ala Glu Ala Ile Ala Glu
435 440 445

Asn Leu Thr Ser Arg Ile Asn Gln Pro Val Thr Leu Val Glu Gly Ser
450 455 460

Leu Glu Leu Lys Gln Gly Val Thr Leu Ile Thr Gln Gly Phe Ser Gln
465 470 475 480

Glu Pro Glu Ser Thr Leu Leu Leu Asp Leu Gly Thr Ser Leu Gln Ala
485 490 495

Ser Thr Glu Asp Ile Val Ile Thr Asn Ser Ser Ile Asn Ala Asp Thr
500 505 510

Ile Tyr Gly Lys Asn Pro Ile Asn Ile Val Ala Ser Ala Ala Asn Lys
515 520 525

Asn Ile Thr Leu Thr Gly Thr Leu Ala Leu Val Asn Ala Asp Gly Ala
530 535 540

Leu Tyr Glu Asn His Thr Leu Gln Asp Ser Gln Asp Tyr Ser Phe Val
545 550 555 560

Lys Leu Ser Pro Gly Ala Gly Gly Thr Ile Ile Thr Gln Asp Ala Ser
565 570 575

Gln Lys Leu Leu Glu Val Ala Pro Ser Arg Pro His Tyr Gly Tyr Gln
580 585 590

Gly His Trp Asn Val Gln Val Ile Pro Gly Thr Gly Thr Gln Pro Ser
595 600 605

Gln Ala Asn Leu Glu Trp Val Arg Thr Gly Tyr Leu Pro Asn Pro Glu
610 615 620

Arg Gln Gly Phe Leu Val Pro Asn Ser Leu Trp Gly Ser Phe Val Asp
625 630 635 640

Gln Arg Ala Ile Gln Glu Ile Met Val Asn Ser Ser Gln Ile Leu Cys
645 650 655

Gln Glu Arg Gly Val Trp Gly Ala Gly Ile Ala Asn Phe Leu His Arg
660 665 670

Asp Lys Ile Asn Glu His Gly Tyr Arg His Ser Gly Val Gly Tyr Leu
675 680 685

Val Gly Val Gly Thr His Ala Phe Ser Asp Ala Thr Ile Asn Ala Ala
690 695 700

Phe Cys Gln Leu Phe Ser Arg Asp Lys Asp Tyr Val Val Ser Lys Asn
705 710 715 720

His Gly Thr Ser Tyr Ser Gly Val Val Phe Leu Glu Asp Thr Leu Glu
725 730 735

Phe Arg Ser Pro Gln Gly Phe Tyr Thr Asp Ser Ser Ser Glu Ala Cys
740 745 750

Cys Asn Gln Val Val Thr Ile Asp Met Gln Leu Ser Tyr Ser His Arg
755 760 765

Asn Asn Asp Met Lys Thr Lys Tyr Thr Thr Tyr Pro Glu Ala Gln Gly
770 775 780

Ser Trp Ala Asn Asp Val Phe Gly Leu Glu Phe Gly Ala Thr Thr Tyr
785 790 795 800

Tyr Tyr Pro Asn Ser Thr Phe Leu Phe Asp Tyr Tyr Ser Pro Phe Leu
805 810 815

Arg Leu Gln Cys Thr Tyr Ala His Gln Glu Asp Phe Lys Glu Thr Gly
820 825 830

Gly Glu Val Arg His Phe Thr Ser Gly Asp Leu Phe Asn Leu Ala Val
835 840 845

Pro Ile Gly Val Lys Phe Glu Arg Phe Ser Asp Cys Lys Arg Gly Ser
850 855 860

Tyr Glu Leu Thr Leu Ala Tyr Val Pro Asp Val Ile Arg Lys Asp Pro
865 870 875 880

Lys Ser Thr Ala Thr Leu Ala Ser Gly Ala Thr Trp Ser Thr His Gly
885 890 895

Asn Asn Leu Ser Arg Gln Gly Leu Gln Leu Arg Leu Gly Asn His Cys
900 905 910

Leu Ile Asn Pro Gly Ile Glu Val Phe Ser His Gly Ala Ile Glu Leu
915 920 925

Arg Gly Ser Ser Arg Asn Tyr Asn Ile Asn Leu Gly Gly Lys Tyr Arg
930 935 940

Phe
945



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3000 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 259...3000
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

ATCAGGTGAT AAAAGTTCCT CGTTAGCTAG TGACTGTAGG TGACATGAGA AAGCTAACAC 60

GGAGGAAACT AAAACCCAAG GAATCGAAGT CTTCATGGTA ATGCTTTTGT TTTTTAGAGA 120

ACTATTCGCA TCAATATAGA AACAAAATAA GTAAATCAAG TTAAAGATGA CAAAACAGCT 180

GTCAAGAATT TTTATCTTGA CTCTCTGAGT TTTCTATTTT ATATGACGCA AGTAAGAATT 240

TAATAATAAA GTGGGTTT ATG AAA TCG CAA TTT TCC TGG TTA GTG CTC TCT 291

Met Lys Ser Gln Phe Ser Trp Leu Val Leu Ser
1 5 10

TCG ACA TTG GCA TGT TTT ACT AGT TGT TCC ACT GTT TTT GCT GCA ACT 339
Ser Thr Leu Ala Cys Phe Thr Ser Cys Ser Thr Val Phe Ala Ala Thr
15 20 25

GCT GAA AAT ATA GGC CCC TCT GAT AGC TTT GAC GGA AGT ACT AAC ACA 387
Ala Glu Asn Ile Gly Pro Ser Asp Ser Phe Asp Gly Ser Thr Asn Thr
30 35 40

GGC ACC TAT ACT CCT AAA AAT ACG ACT ACT GGA ATA GAC TAT ACT CTG 435
Gly Thr Tyr Thr Pro Lys Asn Thr Thr Thr Gly Ile Asp Tyr Thr Leu
45 50 55

ACA GGA GAT ATA ACT CTG CAA AAC CTT GGG GAT TCG GCA GCT TTA ACG 483
Thr Gly Asp Ile Thr Leu Gln Asn Leu Gly Asp Ser Ala Ala Leu Thr
60 65 70 75

AAG GGT TGT TTT TCT GAC ACT ACG GAA TCT TTA AGC TTT GCC GGT AAG 531
Lys Gly Cys Phe Ser Asp Thr Thr Glu Ser Leu Ser Phe Ala Gly Lys
80 85 90

GGG TAC TCA CTT TCT TTT TTA AAT ATT AAG TCT AGT GCT GAA GGC GCA 579
Gly Tyr Ser Leu Ser Phe Leu Asn Ile Lys Ser Ser Ala Glu Gly Ala
95 100 105

GCA CTT TCT GTT ACA ACT GAT AAA AAT CTG TCG CTA ACA GGA TTT TCG 627
Ala Leu Ser Val Thr Thr Asp Lys Asn Leu Ser Leu Thr Gly Phe Ser
110 115 120

AGT CTT ACT TTC TTA GCG GCC CCA TCA TCG GTA ATC ACA ACC CCC TCA 675
Ser Leu Thr Phe Leu Ala Ala Pro Ser Ser Val Ile Thr Thr Pro Ser
125 130 135

GGA AAA GGT GCA GTT AAA TGT GGA GGG GAT CTT ACA TTT GAT AAC AAT 723
Gly Lys Gly Ala Val Lys Cys Gly Gly Asp Leu Thr Phe Asp Asn Asn
140 145 150 155

GGA ACT ATT TTA TTT AAA CAA GAT TAC TGT GAG GAA AAT GGC GGA GCC 771
Gly Thr Ile Leu Phe Lys Gln Asp Tyr Cys Glu Glu Asn Gly Gly Ala
160 165 170

ATT TCT ACC AAG AAT CTT TCT TTG AAA AAC AGC ACG GGA TCG ATT TCT 819
Ile Ser Thr Lys Asn Leu Ser Leu Lys Asn Ser Thr Gly Ser Ile Ser
175 180 185

TTT GAA GGG AAT AAA TCG AGC GCA ACA GGG AAA AAA GGT GGG GCT ATT 867
Phe Glu Gly Asn Lys Ser Ser Ala Thr Gly Lys Lys Gly Gly Ala Ile
190 195 200

TGT GCT ACT GGT ACT GTA GAT ATT ACA AAT AAT ACG GCT CCT ACC CTC 915
Cys Ala Thr Gly Thr Val Asp Ile Thr Asn Asn Thr Ala Pro Thr Leu
205 210 215

TTC TCG AAC AAT ATT GCT GAA GCT GCA GGT GGA GCT ATA AAT AGC ACA 963
Phe Ser Asn Asn Ile Ala Glu Ala Ala Gly Gly Ala Ile Asn Ser Thr
220 225 230 235

GGA AAC TGT ACA ATT ACA GGG AAT ACG TCT CTT GTA TTT TCT GAA AAT 1011
Gly Asn Cys Thr Ile Thr Gly Asn Thr Ser Leu Val Phe Ser Glu Asn
240 245 250

AGT GTG ACA GCG ACC GCA GGA AAT GGA GGA GCT CTT TCT GGA GAT GCC 1059
Ser Val Thr Ala Thr Ala Gly Asn Gly Gly Ala Leu Ser Gly Asp Ala
255 260 265

GAT GTT ACC ATA TCT GGG AAT CAG AGT GTA ACT TTC TCA GGA AAC CAA 1107
Asp Val Thr Ile Ser Gly Asn Gln Ser Val Thr Phe Ser Gly Asn Gln
270 275 280

GCT GTA GCT AAT GGC GGA GCC ATT TAT GCT AAG AAG CTT ACA CTG GCT 1155
Ala Val Ala Asn Gly Gly Ala Ile Tyr Ala Lys Lys Leu Thr Leu Ala
285 290 295

TCC GGG GGG GGG GGG GGT ATC TCC TTT TCT AAC AAT ATA GTC CAA GGT 1203
Ser Gly Gly Gly Gly Gly Ile Ser Phe Ser Asn Asn Ile Val Gln Gly
300 305 310 315

ACC ACT GCA GGT AAT GGT GGA GCC ATT TCT ATA CTG GCA GCT GGA GAG 1251
Thr Thr Ala Gly Asn Gly Gly Ala Ile Ser Ile Leu Ala Ala Gly Glu
320 325 330

TGT AGT CTT TCA GCA GAA GCA GGG GAC ATT ACC TTC AAT GGG AAT GCC 1299
Cys Ser Leu Ser Ala Glu Ala Gly Asp Ile Thr Phe Asn Gly Asn Ala
335 340 345

ATT GTT GCA ACT ACA CCA CAA ACT ACA AAA AGA AAT TCT ATT GAC ATA 1347
Ile Val Ala Thr Thr Pro Gln Thr Thr Lys Arg Asn Ser Ile Asp Ile
350 355 360

GGA TCT ACT GCA AAG ATC ACG AAT TTA CGT GCA ATA TCT GGG CAT AGC 1395
Gly Ser Thr Ala Lys Ile Thr Asn Leu Arg Ala Ile Ser Gly His Ser
365 370 375

ATC TTT TTC TAC GAT CCG ATT ACT GCT AAT ACG GCT GCG GAT TCT ACA 1443
Ile Phe Phe Tyr Asp Pro Ile Thr Ala Asn Thr Ala Ala Asp Ser Thr
380 385 390 395

GAT ACT TTA AAT CTC AAT AAG GCT GAT GCA GGT AAT AGT ACA GAT TAT 1491
Asp Thr Leu Asn Leu Asn Lys Ala Asp Ala Gly Asn Ser Thr Asp Tyr
400 405 410

AGT GGG TCG ATT GTT TTT TCT GGT GAA AAG CTC TCT GAA GAT GAA GCA 1539
Ser Gly Ser Ile Val Phe Ser Gly Glu Lys Leu Ser Glu Asp Glu Ala
415 420 425

AAA GTT GCA GAC AAC CTC ACT TCT ACG CTG AAG CAG CCT GTA ACT CTA 1587
Lys Val Ala Asp Asn Leu Thr Ser Thr Leu Lys Gln Pro Val Thr Leu
430 435 440

ACT GCA GGA AAT TTA GTA CTT AAA CGT GGT GTC ACT CTC GAT ACG AAA 1635
Thr Ala Gly Asn Leu Val Leu Lys Arg Gly Val Thr Leu Asp Thr Lys
445 450 455

GGC TTT ACT CAG ACC GCG GGT TCC TCT GTT ATT ATG GAT GCG GGC ACA 1683
Gly Phe Thr Gln Thr Ala Gly Ser Ser Val Ile Met Asp Ala Gly Thr
460 465 470 475

ACG TTA AAA GCA AGT ACA GAG GAG GTC ACT TTA ACA GGT CTT TCC ATT 1731
Thr Leu Lys Ala Ser Thr Glu Glu Val Thr Leu Thr Gly Leu Ser Ile
480 485 490

CCT GTA GAC TCT TTA GGC GAG GGT AAG AAA GTT GTA ATT GCT GCT TCT 1779
Pro Val Asp Ser Leu Gly Glu Gly Lys Lys Val Val Ile Ala Ala Ser
495 500 505

GCA GCA AGT AAA AAT GTA GCC CTT AGT GGT CCG ATT CTT CTT TTG GAT 1827
Ala Ala Ser Lys Asn Val Ala Leu Ser Gly Pro Ile Leu Leu Leu Asp
510 515 520

AAC CAA GGG AAT GCT TAT GAA AAT CAC GAC TTA GGA AAA ACT CAA GAC 1875
Asn Gln Gly Asn Ala Tyr Glu Asn His Asp Leu Gly Lys Thr Gln Asp
525 530 535

TTT TCA TTT GTG CAG CTC TCT GCT CTG GGT ACT GCA ACA ACT ACA GAT 1923
Phe Ser Phe Val Gln Leu Ser Ala Leu Gly Thr Ala Thr Thr Thr Asp
540 545 550 555

GTT CCA GCG GTT CCT ACA GTA GCA ACT CCT ACG CAC TAT GGG TAT CAA 1971
Val Pro Ala Val Pro Thr Val Ala Thr Pro Thr His Tyr Gly Tyr Gln
560 565 570

GGT ACT TGG GGA ATG ACT TGG GTT GAT GAT ACC GCA AGC ACT CCA AAG 2019
Gly Thr Trp Gly Met Thr Trp Val Asp Asp Thr Ala Ser Thr Pro Lys
575 580 585

ACT AAG ACA GCG ACA TTA GCT TGG ACC AAT ACA GGC TAC CTT CCG AAT 2067
Thr Lys Thr Ala Thr Leu Ala Trp Thr Asn Thr Gly Tyr Leu Pro Asn
590 595 600

CCT GAG CGT CAA GGA CCT TTA GTT CCT AAT AGC CTT TGG GGA TCT TTT 2115
Pro Glu Arg Gln Gly Pro Leu Val Pro Asn Ser Leu Trp Gly Ser Phe
605 610 615

TCA GAC ATC CAA GCG ATT CAA GGT GTC ATA GAG AGA AGT GCT TTG ACT 2163
Ser Asp Ile Gln Ala Ile Gln Gly Val Ile Glu Arg Ser Ala Leu Thr
620 625 630 635

CTT TGT TCA GAT CGA GGC TTC TGG GCT GCG GGA GTC GCC AAT TTC TTA 2211
Leu Cys Ser Asp Arg Gly Phe Trp Ala Ala Gly Val Ala Asn Phe Leu
640 645 650

GAT AAA GAT AAG AAA GGG GAA AAA CGC AAA TAC CGT CAT AAA TCT GGT 2259
Asp Lys Asp Lys Lys Gly Glu Lys Arg Lys Tyr Arg His Lys Ser Gly
655 660 665

GGA TAT GCT ATC GGA GGT GCA GCG CAA ACT TGT TCT GAA AAC TTA ATT 2307
Gly Tyr Ala Ile Gly Gly Ala Ala Gln Thr Cys Ser Glu Asn Leu Ile
670 675 680

AGC TTT GCC TTT TGC CAA CTC TTT GGT AGC GAT AAA GAT TTC TTA GTC 2355
Ser Phe Ala Phe Cys Gln Leu Phe Gly Ser Asp Lys Asp Phe Leu Val
685 690 695

GCT AAA AAT CAT ACT GAT ACC TAT GCA GGA GCC TTC TAT ATC CAA CAC 2403
Ala Lys Asn His Thr Asp Thr Tyr Ala Gly Ala Phe Tyr Ile Gln His
700 705 710 715

ATT ACA GAA TGT AGT GGG TTC ATA GGT TGT CTC TTA GAT AAA CTT CCT 2451
Ile Thr Glu Cys Ser Gly Phe Ile Gly Cys Leu Leu Asp Lys Leu Pro
720 725 730

GGC TCT TGG AGT CAT AAA CCC CTC GTT TTA GAA GGG CAG CTC GCT TAT 2499
Gly Ser Trp Ser His Lys Pro Leu Val Leu Glu Gly Gln Leu Ala Tyr
735 740 745

AGC CAC GTC AGT AAT GAT CTG AAG ACA AAG TAT ACT GCG TAT CCT GAG 2547
Ser His Val Ser Asn Asp Leu Lys Thr Lys Tyr Thr Ala Tyr Pro Glu
750 755 760

GTG AAA GGT TCT TGG GGG AAT AAT GCT TTT AAC ATG ATG TTG GGA GCT 2595
Val Lys Gly Ser Trp Gly Asn Asn Ala Phe Asn Met Met Leu Gly Ala
765 770 775

TCT TCT CAT TCT TAT CCT GAA TAC CTG CAT TGT TTT GAT ACC TAT GCT 2643
Ser Ser His Ser Tyr Pro Glu Tyr Leu His Cys Phe Asp Thr Tyr Ala
780 785 790 795

CCA TAC ATC AAA CTG AAT CTG ACC TAT ATA CGT CAG GAC AGC TTC TCG 2691
Pro Tyr Ile Lys Leu Asn Leu Thr Tyr Ile Arg Gln Asp Ser Phe Ser
800 805 810

GAG AAA GGT ACA GAA GGA AGA TCT TTT GAT GAC AGC AAC CTC TTC AAT 2739
Glu Lys Gly Thr Glu Gly Arg Ser Phe Asp Asp Ser Asn Leu Phe Asn
815 820 825

TTA TCT TTG CCT ATA GGG GTG AAG TTT GAG AAG TTC TCT GAT TGT AAT 2787
Leu Ser Leu Pro Ile Gly Val Lys Phe Glu Lys Phe Ser Asp Cys Asn
830 835 840

GAC TTT TCT TAT GAT CTG ACT TTA TCC TAT GTT CCT GAT CTT ATC CGC 2835
Asp Phe Ser Tyr Asp Leu Thr Leu Ser Tyr Val Pro Asp Leu Ile Arg
845 850 855

AAT GAT CCC AAA TGC ACT ACA GCA CTT GTA ATC AGC GGA GCC TCT TGG 2883
Asn Asp Pro Lys Cys Thr Thr Ala Leu Val Ile Ser Gly Ala Ser Trp
860 865 870 875

GAA ACT TAT GCC AAT AAC TTA GCA CGA CAG GCC TTG CAA GTG CGT GCA 2931
Glu Thr Tyr Ala Asn Asn Leu Ala Arg Gln Ala Leu Gln Val Arg Ala
880 885 890

GGC AGT CAC TAC GCC TTC TCT CCT ATG TTT GAA GTG CTC GGC CAG TTT 2979
Gly Ser His Tyr Ala Phe Ser Pro Met Phe Glu Val Leu Gly Gln Phe
895 900 905

GTC TTT GAA GTT CGT GGA TCC 3000
Val Phe Glu Val Arg Gly Ser
910




(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 914 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 



Met Lys Ser Gln Phe Ser Trp Leu Val Leu Ser Ser Thr Leu Ala Cys
1 5 10 15

Phe Thr Ser Cys Ser Thr Val Phe Ala Ala Thr Ala Glu Asn Ile Gly
20 25 30

Pro Ser Asp Ser Phe Asp Gly Ser Thr Asn Thr Gly Thr Tyr Thr Pro
35 40 45

Lys Asn Thr Thr Thr Gly Ile Asp Tyr Thr Leu Thr Gly Asp Ile Thr
50 55 60

Leu Gln Asn Leu Gly Asp Ser Ala Ala Leu Thr Lys Gly Cys Phe Ser
65 70 75 80

Asp Thr Thr Glu Ser Leu Ser Phe Ala Gly Lys Gly Tyr Ser Leu Ser
85 90 95

Phe Leu Asn Ile Lys Ser Ser Ala Glu Gly Ala Ala Leu Ser Val Thr
100 105 110

Thr Asp Lys Asn Leu Ser Leu Thr Gly Phe Ser Ser Leu Thr Phe Leu
115 120 125

Ala Ala Pro Ser Ser Val Ile Thr Thr Pro Ser Gly Lys Gly Ala Val
130 135 140

Lys Cys Gly Gly Asp Leu Thr Phe Asp Asn Asn Gly Thr Ile Leu Phe
145 150 155 160

Lys Gln Asp Tyr Cys Glu Glu Asn Gly Gly Ala Ile Ser Thr Lys Asn
165 170 175

Leu Ser Leu Lys Asn Ser Thr Gly Ser Ile Ser Phe Glu Gly Asn Lys
180 185 190

Ser Ser Ala Thr Gly Lys Lys Gly Gly Ala Ile Cys Ala Thr Gly Thr
195 200 205

Val Asp Ile Thr Asn Asn Thr Ala Pro Thr Leu Phe Ser Asn Asn Ile
210 215 220

Ala Glu Ala Ala Gly Gly Ala Ile Asn Ser Thr Gly Asn Cys Thr Ile
225 230 235 240

Thr Gly Asn Thr Ser Leu Val Phe Ser Glu Asn Ser Val Thr Ala Thr
245 250 255

Ala Gly Asn Gly Gly Ala Leu Ser Gly Asp Ala Asp Val Thr Ile Ser
260 265 270

Gly Asn Gln Ser Val Thr Phe Ser Gly Asn Gln Ala Val Ala Asn Gly
275 280 285

Gly Ala Ile Tyr Ala Lys Lys Leu Thr Leu Ala Ser Gly Gly Gly Gly
290 295 300

Gly Ile Ser Phe Ser Asn Asn Ile Val Gln Gly Thr Thr Ala Gly Asn
305 310 315 320

Gly Gly Ala Ile Ser Ile Leu Ala Ala Gly Glu Cys Ser Leu Ser Ala
325 330 335

Glu Ala Gly Asp Ile Thr Phe Asn Gly Asn Ala Ile Val Ala Thr Thr
340 345 350

Pro Gln Thr Thr Lys Arg Asn Ser Ile Asp Ile Gly Ser Thr Ala Lys
355 360 365

Ile Thr Asn Leu Arg Ala Ile Ser Gly His Ser Ile Phe Phe Tyr Asp
370 375 380

Pro Ile Thr Ala Asn Thr Ala Ala Asp Ser Thr Asp Thr Leu Asn Leu
385 390 395 400

Asn Lys Ala Asp Ala Gly Asn Ser Thr Asp Tyr Ser Gly Ser Ile Val
405 410 415

Phe Ser Gly Glu Lys Leu Ser Glu Asp Glu Ala Lys Val Ala Asp Asn
420 425 430

Leu Thr Ser Thr Leu Lys Gln Pro Val Thr Leu Thr Ala Gly Asn Leu
435 440 445

Val Leu Lys Arg Gly Val Thr Leu Asp Thr Lys Gly Phe Thr Gln Thr
450 455 460

Ala Gly Ser Ser Val Ile Met Asp Ala Gly Thr Thr Leu Lys Ala Ser
465 470 475 480

Thr Glu Glu Val Thr Leu Thr Gly Leu Ser Ile Pro Val Asp Ser Leu
485 490 495

Gly Glu Gly Lys Lys Val Val Ile Ala Ala Ser Ala Ala Ser Lys Asn
500 505 510

Val Ala Leu Ser Gly Pro Ile Leu Leu Leu Asp Asn Gln Gly Asn Ala
515 520 525

Tyr Glu Asn His Asp Leu Gly Lys Thr Gln Asp Phe Ser Phe Val Gln
530 535 540

Leu Ser Ala Leu Gly Thr Ala Thr Thr Thr Asp Val Pro Ala Val Pro
545 550 555 560

Thr Val Ala Thr Pro Thr His Tyr Gly Tyr Gln Gly Thr Trp Gly Met
565 570 575

Thr Trp Val Asp Asp Thr Ala Ser Thr Pro Lys Thr Lys Thr Ala Thr
580 585 590

Leu Ala Trp Thr Asn Thr Gly Tyr Leu Pro Asn Pro Glu Arg Gln Gly
595 600 605

Pro Leu Val Pro Asn Ser Leu Trp Gly Ser Phe Ser Asp Ile Gln Ala
610 615 620

Ile Gln Gly Val Ile Glu Arg Ser Ala Leu Thr Leu Cys Ser Asp Arg
625 630 635 640

Gly Phe Trp Ala Ala Gly Val Ala Asn Phe Leu Asp Lys Asp Lys Lys
645 650 655

Gly Glu Lys Arg Lys Tyr Arg His Lys Ser Gly Gly Tyr Ala Ile Gly
660 665 670

Gly Ala Ala Gln Thr Cys Ser Glu Asn Leu Ile Ser Phe Ala Phe Cys
675 680 685

Gln Leu Phe Gly Ser Asp Lys Asp Phe Leu Val Ala Lys Asn His Thr
690 695 700

Asp Thr Tyr Ala Gly Ala Phe Tyr Ile Gln His Ile Thr Glu Cys Ser
705 710 715 720

Gly Phe Ile Gly Cys Leu Leu Asp Lys Leu Pro Gly Ser Trp Ser His
725 730 735

Lys Pro Leu Val Leu Glu Gly Gln Leu Ala Tyr Ser His Val Ser Asn
740 745 750

Asp Leu Lys Thr Lys Tyr Thr Ala Tyr Pro Glu Val Lys Gly Ser Trp
755 760 765

Gly Asn Asn Ala Phe Asn Met Met Leu Gly Ala Ser Ser His Ser Tyr
770 775 780

Pro Glu Tyr Leu His Cys Phe Asp Thr Tyr Ala Pro Tyr Ile Lys Leu
785 790 795 800

Asn Leu Thr Tyr Ile Arg Gln Asp Ser Phe Ser Glu Lys Gly Thr Glu
805 810 815

Gly Arg Ser Phe Asp Asp Ser Asn Leu Phe Asn Leu Ser Leu Pro Ile
820 825 830

Gly Val Lys Phe Glu Lys Phe Ser Asp Cys Asn Asp Phe Ser Tyr Asp
835 840 845

Leu Thr Leu Ser Tyr Val Pro Asp Leu Ile Arg Asn Asp Pro Lys Cys
850 855 860

Thr Thr Ala Leu Val Ile Ser Gly Ala Ser Trp Glu Thr Tyr Ala Asn
865 870 875 880

Asn Leu Ala Arg Gln Ala Leu Gln Val Arg Ala Gly Ser His Tyr Ala
885 890 895

Phe Ser Pro Met Phe Glu Val Leu Gly Gln Phe Val Phe Glu Val Arg
900 905 910

Gly Ser



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...1200 
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

GAT CCT AAA AAT AAA GAG TAC ACA GGG ACC ATA CTC TTT TCT GGA GAA 48
Asp Pro Lys Asn Lys Glu Tyr Thr Gly Thr Ile Leu Phe Ser Gly Glu
1 5 10 15

AAG AGT CTA GCA AAC GAT CCT AGG GAT TTT AAA TCT ACA ATC CCT CAG 96
Lys Ser Leu Ala Asn Asp Pro Arg Asp Phe Lys Ser Thr Ile Pro Gln
20 25 30

AAC GTC AAC CTG TCT GCA GGA TAC TTA GTT ATT AAA GAG GGG GCC GAA 144
Asn Val Asn Leu Ser Ala Gly Tyr Leu Val Ile Lys Glu Gly Ala Glu
35 40 45

GTC ACA GTT TCA AAA TTC ACG CAG TCT CCA GGA TCG CAT TTA GTT TTA 192
Val Thr Val Ser Lys Phe Thr Gln Ser Pro Gly Ser His Leu Val Leu
50 55 60

GAT TTA GGA ACC AAA CTG ATA GCC TCT AAG GAA GAC ATT GCC ATC ACA 240
Asp Leu Gly Thr Lys Leu Ile Ala Ser Lys Glu Asp Ile Ala Ile Thr
65 70 75 80

GGC CTC GCG ATA GAT ATA GAT AGC TTA AGC TCA TCC TCA ACA GCA GCT 288
Gly Leu Ala Ile Asp Ile Asp Ser Leu Ser Ser Ser Ser Thr Ala Ala
85 90 95

GTT ATT AAA GCA AAC ACC GCA AAT AAA CAG ATA TCC GTG ACG GAC TCT 336
Val Ile Lys Ala Asn Thr Ala Asn Lys Gln Ile Ser Val Thr Asp Ser
100 105 110

ATA GAA CTT ATC TCG CCT ACT GGC AAT GCC TAT GAA GAT CTC AGA ATG 384
Ile Glu Leu Ile Ser Pro Thr Gly Asn Ala Tyr Glu Asp Leu Arg Met
115 120 125

AGA AAT TCA CAG ACG TTC CCT CTG CTC TCT TTA GAG CCT GGA GCC GGG 432
Arg Asn Ser Gln Thr Phe Pro Leu Leu Ser Leu Glu Pro Gly Ala Gly
130 135 140

GGT AGT GTG ACT GTA ACT GCT GGA GAT TTC CTA CCG GTA AGT CCC CAT 480
Gly Ser Val Thr Val Thr Ala Gly Asp Phe Leu Pro Val Ser Pro His
145 150 155 160

TAT GGT TTT CAA GGC AAT TGG AAA TTA GCT TGG ACA GGA ACT GGA AAC 528
Tyr Gly Phe Gln Gly Asn Trp Lys Leu Ala Trp Thr Gly Thr Gly Asn
165 170 175

AAA GTT GGA GAA TTC TTC TGG GAT AAA ATA AAT TAT AAG CCT AGA CCT 576
Lys Val Gly Glu Phe Phe Trp Asp Lys Ile Asn Tyr Lys Pro Arg Pro
180 185 190

GAA AAA GAA GGA AAT TTA GTT CCT AAT ATC TTG TGG GGG AAT GCT GTA 624
Glu Lys Glu Gly Asn Leu Val Pro Asn Ile Leu Trp Gly Asn Ala Val
195 200 205

AAT GTC AGA TCC TTA ATG CAG GTT CAA GAG ACC CAT GCA TCG AGC TTA 672
Asn Val Arg Ser Leu Met Gln Val Gln Glu Thr His Ala Ser Ser Leu
210 215 220

CAG ACA GAT CGA GGG CTG TGG ATC GAT GGA ATT GGG AAT TTC TTC CAT 720
Gln Thr Asp Arg Gly Leu Trp Ile Asp Gly Ile Gly Asn Phe Phe His
225 230 235 240

GTA TCT GCC TCC GAA GAC AAT ATA AGG TAC CGT CAT AAC AGC GGT GGA 768
Val Ser Ala Ser Glu Asp Asn Ile Arg Tyr Arg His Asn Ser Gly Gly
245 250 255

TAT GTT CTA TCT GTA AAT AAT GAG ATC ACA CCT AAG CAC TAT ACT TCG 816
Tyr Val Leu Ser Val Asn Asn Glu Ile Thr Pro Lys His Tyr Thr Ser
260 265 270

ATG GCA TTT TCC CAA CTC TTT AGT AGA GAC AAA GAC TAT GCG GTT TCC 864
Met Ala Phe Ser Gln Leu Phe Ser Arg Asp Lys Asp Tyr Ala Val Ser
275 280 285

AAC AAC GAA TAC AGA ATG TAT TTA GGA TCG TAT CTC TAT CAA TAT ACA 912
Asn Asn Glu Tyr Arg Met Tyr Leu Gly Ser Tyr Leu Tyr Gln Tyr Thr
290 295 300

ACC TCC CTA GGG AAT ATT TTC CGT TAT GCT TCG CGT AAC CCT AAT GTA 960
Thr Ser Leu Gly Asn Ile Phe Arg Tyr Ala Ser Arg Asn Pro Asn Val
305 310 315 320

AAC GTC GGG ATT CTC TCA AGA AGG TTT CTT CAA AAT CCT CTT ATG ATT 1008
Asn Val Gly Ile Leu Ser Arg Arg Phe Leu Gln Asn Pro Leu Met Ile
325 330 335

TTT CAT TTT TTG TGT GCT TAT GGT CAT GCC ACC AAT GAT ATG AAA ACA 1056
Phe His Phe Leu Cys Ala Tyr Gly His Ala Thr Asn Asp Met Lys Thr
340 345 350

GAC TAC GCA AAT TTC CCT ATG GTG AAA AAC AGC TGG AGA AAC AAT TGT 1104
Asp Tyr Ala Asn Phe Pro Met Val Lys Asn Ser Trp Arg Asn Asn Cys
355 360 365

TGG GCT ATA AAA TGC GGA GGG AGC ATG CCT CTA TTG GTA TTT GAA AAC 1152
Trp Ala Ile Lys Cys Gly Gly Ser Met Pro Leu Leu Val Phe Glu Asn
370 375 380

GGA AAA CTT TTC CAA GGT GCC ATC CCA TTT ATG AAA CTA CAA TTA GTT 1200
Gly Lys Leu Phe Gln Gly Ala Ile Pro Phe Met Lys Leu Gln Leu Val
385 390 395 400



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 400 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



Asp Pro Lys Asn Lys Glu Tyr Thr Gly Thr Ile Leu Phe Ser Gly Glu
1 5 10 15

Lys Ser Leu Ala Asn Asp Pro Arg Asp Phe Lys Ser Thr Ile Pro Gln
20 25 30

Asn Val Asn Leu Ser Ala Gly Tyr Leu Val Ile Lys Glu Gly Ala Glu
35 40 45

Val Thr Val Ser Lys Phe Thr Gln Ser Pro Gly Ser His Leu Val Leu
50 55 60

Asp Leu Gly Thr Lys Leu Ile Ala Ser Lys Glu Asp Ile Ala Ile Thr
65 70 75 80

Gly Leu Ala Ile Asp Ile Asp Ser Leu Ser Ser Ser Ser Thr Ala Ala
85 90 95

Val Ile Lys Ala Asn Thr Ala Asn Lys Gln Ile Ser Val Thr Asp Ser
100 105 110

Ile Glu Leu Ile Ser Pro Thr Gly Asn Ala Tyr Glu Asp Leu Arg Met
115 120 125

Arg Asn Ser Gln Thr Phe Pro Leu Leu Ser Leu Glu Pro Gly Ala Gly
130 135 140

Gly Ser Val Thr Val Thr Ala Gly Asp Phe Leu Pro Val Ser Pro His
145 150 155 160

Tyr Gly Phe Gln Gly Asn Trp Lys Leu Ala Trp Thr Gly Thr Gly Asn
165 170 175

Lys Val Gly Glu Phe Phe Trp Asp Lys Ile Asn Tyr Lys Pro Arg Pro
180 185 190

Glu Lys Glu Gly Asn Leu Val Pro Asn Ile Leu Trp Gly Asn Ala Val
195 200 205

Asn Val Arg Ser Leu Met Gln Val Gln Glu Thr His Ala Ser Ser Leu
210 215 220

Gln Thr Asp Arg Gly Leu Trp Ile Asp Gly Ile Gly Asn Phe Phe His
225 230 235 240

Val Ser Ala Ser Glu Asp Asn Ile Arg Tyr Arg His Asn Ser Gly Gly
245 250 255

Tyr Val Leu Ser Val Asn Asn Glu Ile Thr Pro Lys His Tyr Thr Ser
260 265 270

Met Ala Phe Ser Gln Leu Phe Ser Arg Asp Lys Asp Tyr Ala Val Ser
275 280 285

Asn Asn Glu Tyr Arg Met Tyr Leu Gly Ser Tyr Leu Tyr Gln Tyr Thr
290 295 300

Thr Ser Leu Gly Asn Ile Phe Arg Tyr Ala Ser Arg Asn Pro Asn Val
305 310 315 320

Asn Val Gly Ile Leu Ser Arg Arg Phe Leu Gln Asn Pro Leu Met Ile
325 330 335

Phe His Phe Leu Cys Ala Tyr Gly His Ala Thr Asn Asp Met Lys Thr
340 345 350

Asp Tyr Ala Asn Phe Pro Met Val Lys Asn Ser Trp Arg Asn Asn Cys
355 360 365

Trp Ala Ile Lys Cys Gly Gly Ser Met Pro Leu Leu Val Phe Glu Asn
370 375 380

Gly Lys Leu Phe Gln Gly Ala Ile Pro Phe Met Lys Leu Gln Leu Val
385 390 395 400



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1830 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...1830 
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GAT CTC ACA TTA GGG AGT CGT GAC AGT TAT AAT GGT GAT ACA AGC ACC 48
Asp Leu Thr Leu Gly Ser Arg Asp Ser Tyr Asn Gly Asp Thr Ser Thr
1 5 10 15

ACA GAA TTT ACT CCT AAA GCG GCA ACT TCT GAT GCT AGT GGC ACG ACC 96
Thr Glu Phe Thr Pro Lys Ala Ala Thr Ser Asp Ala Ser Gly Thr Thr
20 25 30

TAT ATT CTC GAT GGG GAT GTC TCG ATA AGC CAA GCA GGG AAA CAA ACG 144
Tyr Ile Leu Asp Gly Asp Val Ser Ile Ser Gln Ala Gly Lys Gln Thr
35 40 45

AGC TTA ACC ACA AGT TGT TTT TCT AAC ACT GCA GGA AAT CTT ACC TTC 192
Ser Leu Thr Thr Ser Cys Phe Ser Asn Thr Ala Gly Asn Leu Thr Phe
50 55 60

TTA GGG AAC GGA TTT TCT CTT CAT TTT GAC AAT ATT ATT TCG TCT ACT 240
Leu Gly Asn Gly Phe Ser Leu His Phe Asp Asn Ile Ile Ser Ser Thr
65 70 75 80

GTT GCA GGT GTT GTT GTT AGC AAT ACA GCA GCT TCT GGG ATT ACG AAA 288
Val Ala Gly Val Val Val Ser Asn Thr Ala Ala Ser Gly Ile Thr Lys
85 90 95

TTC TCA GGA TTT TCA ACT CTT CGG ATG CTT GCA GCT CCT AGG ACC ACA 336
Phe Ser Gly Phe Ser Thr Leu Arg Met Leu Ala Ala Pro Arg Thr Thr
100 105 110

GGT AAA GGA GCC ATT AAA ATT ACC GAT GGT CTG GTG TTT GAG AGT ATA 384
Gly Lys Gly Ala Ile Lys Ile Thr Asp Gly Leu Val Phe Glu Ser Ile
115 120 125

GGG AAT CTT GAT CCG ATT ACT GTA ACA GGA TCG ACA TCT GTT GCT GAT 432
Gly Asn Leu Asp Pro Ile Thr Val Thr Gly Ser Thr Ser Val Ala Asp
130 135 140

GCT CTC AAT ATT AAT AGC CCT GAT ACT GGA GAT AAC AAA GAG TAT ACG 480
Ala Leu Asn Ile Asn Ser Pro Asp Thr Gly Asp Asn Lys Glu Tyr Thr
145 150 155 160

GGA ACC ATA GTC TTT TCT GGA GAG AAG CTC ACG GAG GCA GAA GCT AAA 528
Gly Thr Ile Val Phe Ser Gly Glu Lys Leu Thr Glu Ala Glu Ala Lys
165 170 175

GAT GAG AAG AAC CGC ACT TCT AAA TTA CTT CAA AAT GTT GCT TTT AAA 576
Asp Glu Lys Asn Arg Thr Ser Lys Leu Leu Gln Asn Val Ala Phe Lys
180 185 190

AAT GGG ACT GTA GTT TTA AAA GGT GAT GTC GTT TTA AGT GCG AAC GGT 624
Asn Gly Thr Val Val Leu Lys Gly Asp Val Val Leu Ser Ala Asn Gly
195 200 205

TTC TCT CAG GAT GCA AAC TCT AAG TTG ATT ATG GAT TTA GGG ACG TCG 672
Phe Ser Gln Asp Ala Asn Ser Lys Leu Ile Met Asp Leu Gly Thr Ser
210 215 220

TTG GTT GCA AAC ACC GAA AGT ATC GAG TTA ACG AAT TTG GAA ATT AAT 720
Leu Val Ala Asn Thr Glu Ser Ile Glu Leu Thr Asn Leu Glu Ile Asn
225 230 235 240

ATA GAC TCT CTC AGG AAC GGG AAA AAG ATA AAA CTC AGT GCT GCC ACA 768
Ile Asp Ser Leu Arg Asn Gly Lys Lys Ile Lys Leu Ser Ala Ala Thr
245 250 255

GCT CAG AAA GAT ATT CGT ATA GAT CGT CCT GTT GTA CTG GCA ATT AGC 816
Ala Gln Lys Asp Ile Arg Ile Asp Arg Pro Val Val Leu Ala Ile Ser
260 265 270

GAT GAG AGT TTT TAT CAA AAT GGC TTT TTG AAT GAG GAC CAT TCC TAT 864
Asp Glu Ser Phe Tyr Gln Asn Gly Phe Leu Asn Glu Asp His Ser Tyr
275 280 285

GAT GGG ATT CTT GAG TTA GAT GCT GGG AAA GAC ATC GTG ATT TCT GCA 912
Asp Gly Ile Leu Glu Leu Asp Ala Gly Lys Asp Ile Val Ile Ser Ala
290 295 300

GAT TCT CGC AGT ATA GAT GCT GTA CAA TCT CCG TAT GGC TAT CAG GGA 960
Asp Ser Arg Ser Ile Asp Ala Val Gln Ser Pro Tyr Gly Tyr Gln Gly
305 310 315 320

AAG TGG ACG ATC AAT TGG TCT ACT GAT GAT AAG AAA GCT ACG GTT TCT 1008
Lys Trp Thr Ile Asn Trp Ser Thr Asp Asp Lys Lys Ala Thr Val Ser
325 330 335

TGG GCG AAG CAG AGT TTT AAT CCC ACT GCT GAG CAG GAG GCT CCG TTA 1056
Trp Ala Lys Gln Ser Phe Asn Pro Thr Ala Glu Gln Glu Ala Pro Leu
340 345 350

GTT CCT AAT CTT CTT TGG GGT TCT TTT ATA GAT GTT CGT TCC TTC CAG 1104
Val Pro Asn Leu Leu Trp Gly Ser Phe Ile Asp Val Arg Ser Phe Gln
355 360 365

AAT TTT ATA GAG CTA GGT ACT GAA GGT GCT CCT TAC GAA AAG AGA TTT 1152
Asn Phe Ile Glu Leu Gly Thr Glu Gly Ala Pro Tyr Glu Lys Arg Phe
370 375 380

TGG GTT GCA GGC ATT TCC AAT GTT TTG CAT AGG AGC GGT CGT GAA AAT 1200
Trp Val Ala Gly Ile Ser Asn Val Leu His Arg Ser Gly Arg Glu Asn
385 390 395 400

CAA AGG AAA TTC CGT CAT GTG AGT GGA GGT GCT GTA GTA GGT GCT AGC 1248
Gln Arg Lys Phe Arg His Val Ser Gly Gly Ala Val Val Gly Ala Ser
405 410 415

ACG AGG ATG CCG GGT GGT GAT ACC TTG TCT CTG GGT TTT GCT CAG CTC 1296
Thr Arg Met Pro Gly Gly Asp Thr Leu Ser Leu Gly Phe Ala Gln Leu
420 425 430

TTT GCG CGT GAC AAA GAC TAC TTT ATG AAT ACC AAT TTC GCA AAG ACC 1344
Phe Ala Arg Asp Lys Asp Tyr Phe Met Asn Thr Asn Phe Ala Lys Thr
435 440 445

TAC GCA GGA TCT TTA CGT TTG CAG CAC GAT GCT TCC CTA TAC TCT GTG 1392
Tyr Ala Gly Ser Leu Arg Leu Gln His Asp Ala Ser Leu Tyr Ser Val
450 455 460

GTG AGT ATC CTT TTA GGA GAG GGA GGA CTC CGC GAG ATC CTG TTG CCT 1440
Val Ser Ile Leu Leu Gly Glu Gly Gly Leu Arg Glu Ile Leu Leu Pro
465 470 475 480

TAT GTT TCC AAT ACT CTG CCG TGC TCT TTC TAT GGG CAG CTT AGC TAC 1488
Tyr Val Ser Asn Thr Leu Pro Cys Ser Phe Tyr Gly Gln Leu Ser Tyr
485 490 495

GGC CAT ACG GAT CAT CGC ATG AAG ACC GAG TCT CTA CCC CCC CCC CCC 1536
Gly His Thr Asp His Arg Met Lys Thr Glu Ser Leu Pro Pro Pro Pro
500 505 510

CCG ACG CTC TCG ACG GAT CAT ACT TCT TGG GGA GGA TAT GTC TGG GCT 1584
Pro Thr Leu Ser Thr Asp His Thr Ser Trp Gly Gly Tyr Val Trp Ala
515 520 525

GGA GAG CTG GGA ACT CGA GTT GCT GTT GAA AAT ACC AGC GGC AGA GGA 1632
Gly Glu Leu Gly Thr Arg Val Ala Val Glu Asn Thr Ser Gly Arg Gly
530 535 540

TTT TTC CGA GAG TAC ACT CCA TTT GTA AAA GTC CAA GCT GTT TAC TCG 1680
Phe Phe Arg Glu Tyr Thr Pro Phe Val Lys Val Gln Ala Val Tyr Ser
545 550 555 560

CGC CAA GAT AGC TTT GTT GAA CTA GGA GCT ATC AGT CGT GAT TTT AGT 1728
Arg Gln Asp Ser Phe Val Glu Leu Gly Ala Ile Ser Arg Asp Phe Ser
565 570 575

GAT TCG CAT CTT TAT AAC CTT GCG ATT CCT CTT GGA ATC AAG TTA GAG 1776
Asp Ser His Leu Tyr Asn Leu Ala Ile Pro Leu Gly Ile Lys Leu Glu
580 585 590

AAA CGG TTT GCA GAG CAA TAT TAT CAT GTT GTT GCG ATG TAT TCT CCA 1824
Lys Arg Phe Ala Glu Gln Tyr Tyr His Val Val Ala Met Tyr Ser Pro
595 600 605

GAT GTT 1830
Asp Val
610



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:








Asp Leu Thr Leu Gly Ser Arg Asp Ser Tyr Asn Gly Asp Thr Ser Thr
1 5 10 15

Thr Glu Phe Thr Pro Lys Ala Ala Thr Ser Asp Ala Ser Gly Thr Thr
20 25 30

Tyr Ile Leu Asp Gly Asp Val Ser Ile Ser Gln Ala Gly Lys Gln Thr
35 40 45

Ser Leu Thr Thr Ser Cys Phe Ser Asn Thr Ala Gly Asn Leu Thr Phe
50 55 60

Leu Gly Asn Gly Phe Ser Leu His Phe Asp Asn Ile Ile Ser Ser Thr
65 70 75 80

Val Ala Gly Val Val Val Ser Asn Thr Ala Ala Ser Gly Ile Thr Lys
85 90 95

Phe Ser Gly Phe Ser Thr Leu Arg Met Leu Ala Ala Pro Arg Thr Thr
100 105 110

Gly Lys Gly Ala Ile Lys Ile Thr Asp Gly Leu Val Phe Glu Ser Ile
115 120 125

Gly Asn Leu Asp Pro Ile Thr Val Thr Gly Ser Thr Ser Val Ala Asp
130 135 140

Ala Leu Asn Ile Asn Ser Pro Asp Thr Gly Asp Asn Lys Glu Tyr Thr
145 150 155 160

Gly Thr Ile Val Phe Ser Gly Glu Lys Leu Thr Glu Ala Glu Ala Lys
165 170 175

Asp Glu Lys Asn Arg Thr Ser Lys Leu Leu Gln Asn Val Ala Phe Lys
180 185 190

Asn Gly Thr Val Val Leu Lys Gly Asp Val Val Leu Ser Ala Asn Gly
195 200 205

Phe Ser Gln Asp Ala Asn Ser Lys Leu Ile Met Asp Leu Gly Thr Ser
210 215 220

Leu Val Ala Asn Thr Glu Ser Ile Glu Leu Thr Asn Leu Glu Ile Asn
225 230 235 240

Ile Asp Ser Leu Arg Asn Gly Lys Lys Ile Lys Leu Ser Ala Ala Thr
245 250 255

Ala Gln Lys Asp Ile Arg Ile Asp Arg Pro Val Val Leu Ala Ile Ser
260 265 270

Asp Glu Ser Phe Tyr Gln Asn Gly Phe Leu Asn Glu Asp His Ser Tyr
275 280 285

Asp Gly Ile Leu Glu Leu Asp Ala Gly Lys Asp Ile Val Ile Ser Ala
290 295 300

Asp Ser Arg Ser Ile Asp Ala Val Gln Ser Pro Tyr Gly Tyr Gln Gly
305 310 315 320

Lys Trp Thr Ile Asn Trp Ser Thr Asp Asp Lys Lys Ala Thr Val Ser
325 330 335

Trp Ala Lys Gln Ser Phe Asn Pro Thr Ala Glu Gln Glu Ala Pro Leu
340 345 350

Val Pro Asn Leu Leu Trp Gly Ser Phe Ile Asp Val Arg Ser Phe Gln
355 360 365

Asn Phe Ile Glu Leu Gly Thr Glu Gly Ala Pro Tyr Glu Lys Arg Phe
370 375 380

Trp Val Ala Gly Ile Ser Asn Val Leu His Arg Ser Gly Arg Glu Asn
385 390 395 400

Gln Arg Lys Phe Arg His Val Ser Gly Gly Ala Val Val Gly Ala Ser
405 410 415

Thr Arg Met Pro Gly Gly Asp Thr Leu Ser Leu Gly Phe Ala Gln Leu
420 425 430

Phe Ala Arg Asp Lys Asp Tyr Phe Met Asn Thr Asn Phe Ala Lys Thr
435 440 445

Tyr Ala Gly Ser Leu Arg Leu Gln His Asp Ala Ser Leu Tyr Ser Val
450 455 460

Val Ser Ile Leu Leu Gly Glu Gly Gly Leu Arg Glu Ile Leu Leu Pro
465 470 475 480

Tyr Val Ser Asn Thr Leu Pro Cys Ser Phe Tyr Gly Gln Leu Ser Tyr
485 490 495

Gly His Thr Asp His Arg Met Lys Thr Glu Ser Leu Pro Pro Pro Pro
500 505 510

Pro Thr Leu Ser Thr Asp His Thr Ser Trp Gly Gly Tyr Val Trp Ala
515 520 525

Gly Glu Leu Gly Thr Arg Val Ala Val Glu Asn Thr Ser Gly Arg Gly
530 535 540

Phe Phe Arg Glu Tyr Thr Pro Phe Val Lys Val Gln Ala Val Tyr Ser
545 550 555 560

Arg Gln Asp Ser Phe Val Glu Leu Gly Ala Ile Ser Arg Asp Phe Ser
565 570 575

Asp Ser His Leu Tyr Asn Leu Ala Ile Pro Leu Gly Ile Lys Leu Glu
580 585 590

Lys Arg Phe Ala Glu Gln Tyr Tyr His Val Val Ala Met Tyr Ser Pro
595 600 605

Asp Val
610

1. Species specific diagnostic test for identifying
infection of a mammal, such as a human, with Chlamydia
pneumoniae, said test comprising detecting in a patient or in
a patient sample the presence of antibodies against one or
more proteins from the outer membrane of Chlamydia pneumoniae,
said proteins being of a molecular weight of 100.3-89.6 kDa
or of 56.1 kDa, or detecting the presence of nucleic acid
fragments encoding said outer membrane proteins.

2. Diagnostic test according to claim 1, wherein the outer
membrane protein has the sequence as shown in SEQ ID NO: 2,
SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ
ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ
ID NO: 20, SEQ ID NO: 22, or in SEQ ID NO: 24, or a variant
or subsequence thereof.

3. Diagnostic test according to claim 1, wherein the nucleic
acid fragment has the sequence shown in SEQ ID NO: 1, SEQ ID
NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO:
11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO:
19, SEQ ID NO: 21, or in SEQ ID NO: 23, or a variant or
subsequence thereof.

4. Diagnostic test according to claim 3, wherein detection of
nucleic acid fragments is obtained by using nucleic acid
amplification.

5. Diagnostic test according to claim 4, wherein detection 
of nucleic acid fragments is obtained by using polymerase 
chain reaction. 

6. A nucleic acid fragment derived from Chlamydia pneumoniae
comprising the nucleotide sequence SEQ ID NO: 1, SEQ ID NO:
3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11,
SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19,
SEQ ID NO: 21, or SEQ ID NO: 23, or a variant or subsequence
of said nucleotide sequence which has a sequence homology of
at least 50% with any of the sequences mentioned.

7. A protein derived from Chlamydia pneumoniae having the
amino acid sequence shown in SEQ ID NO: 2, SEQ ID NO: 4, SEQ
ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID
NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID
NO: 22, or SEQ ID NO: 24, or a variant or subsequence thereof
having a sequence similarity of at least 50% and a similar
biological function.

8. Polyclonal monospecific antibody against the protein
with the sequence shown in SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID
NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO:
14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO:
22, or SEQ ID NO: 24, or a variant or subsequence thereof.

9. A diagnostic kit for the diagnosis of infection of a
mammal, such as a human, with Chlamydia pneumoniae, said kit
comprising a protein with the amino acid sequence SEQ ID NO:
2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10,
SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18,
SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a variant
or subsequence thereof.

10. A diagnostic kit for the diagnosis of infection of a 
mammal, such as a human, with Chlamydia pneumoniae, said kit 
comprising antibodies against a protein with the amino acid 
sequence SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO:
8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO:
16, SEQ ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID 
NO: 24, or a variant or subsequence thereof. 

11. A diagnostic kit for the diagnosis of infection of a
mammal, such as a human, with Chlamydia pneumoniae, said kit
comprising a nucleic acid fragment with the sequence SEQ ID
NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO:
9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO:
17, SEQ ID NO: 19, SEQ ID NO: 21, or SEQ ID NO: 23, or a
variant or subsequence thereof.

12. A composition for immunizing a mammal, such as a 
human, against Chlamydia pneumoniae, said composition
comprising a protein with the amino acid sequence shown in
SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ 
ID NO: 18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or 
a variant or subsequence thereof. 

13. Use of a protein with the sequence shown in SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a
variant or subsequence thereof in diagnosis of infection of a
mammal, such as a human, with Chlamydia pneumoniae.

14. Use of the protein with the sequence shown in SEQ ID 
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a
variant or subsequence thereof in an undenatured form, in 
diagnosis of infection of a mammal, such as a human, with 
Chlamydia pneumoniae. 

15. Use of a protein with the sequence shown in SEQ ID 
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a 
variant or subsequence thereof, for immunizing a mammal, such 
as a human, against Chlamydia pneumoniae. 

16. Use of the protein with the sequence shown in SEQ ID
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO:
10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO:
18, SEQ ID NO: 20, SEQ ID NO: 22, or SEQ ID NO: 24, or a
variant or subsequence thereof in an undenatured form, for
immunizing a mammal, such as a human, against Chlamydia
pneumoniae.

17. Use of a nucleic acid fragment with the nucleotide
sequence shown in SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ
ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 21, or
SEQ ID NO: 23, or a variant or subsequence of said nucleotide
sequence which has a sequence homology of at least 50% with
any of the sequences mentioned for immunizing a mammal, such
as a human, against Chlamydia pneumoniae.

Immunoblotting of C. pneumoniae EB; lanes 1-3 heated to 100°C in SDS-sample buffer,
lanes 4-6 unheated. Lane 1 reacted with rabbit anti-C. pneumoniae OMC; lanes 2 and 4 with
pre-serum; lanes 3 and 5 with polyclonal rabbit anti-pEXl-1 fusion protein; lane 6 with MAb 26.1.